BUSS6002 - Data Science in Business

Pre-Tutorial Checklist

  1. Install Anaconda on your laptop
  2. Confirm that Anaconda is installed properly
  3. Confirm that Jupyter works
  4. Bring your laptop to class

Tutorial 1 - Getting Started

Welcome to the first tutorial for BUSS6002!

During the tutorials of BUSS6002 we will spend a lot of time on practical application of the lectures.

This means you will be coding (programming) in Python.

You might not have any coding experience and that's ok! Your tutors are here to help you and provide any support you need.

Get Python 3

Anaconda is a package that includes Python and related tools. It is free to download here:

https://www.anaconda.com/download/

Please only use Python 3.

Jupyter Notebooks

Jupyter is a web-based environment for Python. It allows you to mix Python code and text, which makes it perfect for exploratory analysis! We will use Jupyter notebooks throughout the semester. Of course, if you prefer, you can use a different tool such as Spyder, PyCharm or even a plain text editor!

If you want to know more about the different ways to use Python, please refer to this week's PENCAST.

First open Anaconda Navigator, then click the “Launch” button for Jupyter.

Jupyter will open in your browser.

Click the "New" button to make a new Python notebook.

Python

Hello World

Python is an interpreted language, which means that it can be used interactively!

Try the following code in your own Notebook

In [1]:
print("Hello World!")
Hello World!

What did we just do?

  • print is a function that displays the input on the screen
  • "Hello World!" is a string which is just a piece of text

We will learn more about functions and strings later!

Variables

Python stores data in variables.

You may already be familiar with the concept of variables from algebra.

Let’s start by creating a variable called “x” with the value of “10”.

In [2]:
x = 10

We can check the value of x by printing it in the console

In [3]:
print(x)
10

TIP: You can choose the name of your variables freely, so long as they do not match any existing special keywords or other defined function or library names. Keep your variable names short and meaningful.

Python variables have one major difference from mathematics: variables are not always numeric.

Every variable has a type. The type defines the behavior for the variable.

Python’s numeric data types are:

  • int (Integers)
  • float (Floating Point Numbers or Decimal Valued Numbers)
  • complex (Complex Numbers)

To represent text, we use the string data type. Strings are denoted by wrapping the text in matching single or double quotation marks.
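As a quick illustration of the types listed above, the sketch below uses the built-in type function on one value of each kind. The variable names and values are made up for this example and are not used elsewhere in the tutorial.

```python
# Hypothetical example values, one for each type mentioned above
whole_number = 7          # int
decimal_number = 7.5      # float
complex_number = 2 + 3j   # complex
single_quoted = 'hello'   # str written with single quotes
double_quoted = "hello"   # str written with double quotes

print(type(whole_number))
print(type(decimal_number))
print(type(complex_number))
print(type(single_quoted))

# Both quote styles create exactly the same string
print(single_quoted == double_quoted)
```

The last line prints True because the choice of quotation mark does not change the text that is stored.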

Let’s experiment with some more variables:

In [4]:
y = 5
z = x + y
In [5]:
print(z)
15
In [6]:
print( type(z) )
<class 'int'>

So far we have created three variables: x, y and z. What does creating variables mean? It means that Python stores the values for these variables in your computer's memory. After running the above code our computer's memory will look something like this:

We can check this by using the %whos command

In [7]:
%whos
Variable   Type    Data/Info
----------------------------
x          int     10
y          int     5
z          int     15

Let’s try another example

In [8]:
y = 10.5
z = x + y
In [9]:
print(z)
20.5
In [10]:
print( type(z) )
<class 'float'>

Notice that the type of z is now a float, not an integer, and our memory looks like this:

We can do other mathematical operations on numbers:

In [11]:
print(z - 3) # Subtraction
print(z * 5) # Multiplication
print(z / 2) # Division
17.5
102.5
10.25

Note that not every operator works across every combination of data types. For example, it is invalid to add an integer to a string or to subtract an integer from a string.
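To see what this looks like in practice, here is a minimal sketch. The variable names are hypothetical, and the try/except block (not covered in this tutorial) is only there so that the cell keeps running after the error.

```python
text_value = "10"   # a string that happens to contain digits

try:
    text_value - 3  # str - int is not allowed
except TypeError as error:
    print("Python raised a TypeError:", error)

# Converting the string to an integer first makes the subtraction valid
difference = int(text_value) - 3
print(difference)
```

The final print shows 7, because int converts the text "10" into the number 10 before the subtraction happens.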

However often data types borrow the concepts of operators from other areas. For example you can join two strings together by using the + operator.

In [12]:
my_string = "Hello World!"
print(my_string)

end_string = " - from Steve"
print(my_string + end_string)
Hello World!
Hello World! - from Steve

Since creating strings places them in our computer's memory, an updated model of the computer's memory is

Lists

Lists are a way to store an ordered collection of variables or objects. Let’s create a list and store it in the variable my_list.

In [9]:
my_list = [1, 2, 4, 8, 16]
print(my_list)
[1, 2, 4, 8, 16]

List items can be accessed by their index. Python uses zero-indexing which means the first item starts at 0.

Print the value of the first item:

In [14]:
print(my_list[0])
1
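The sketch below shows a few more index positions on a separate, hypothetical list called powers, so it does not change my_list. Negative indexes are not mentioned above; they are included as an extra to show that Python can also count from the end of a list.

```python
powers = [1, 2, 4, 8, 16]   # hypothetical list, separate from my_list

print(powers[0])    # first item: 1
print(powers[1])    # second item: 2
print(powers[4])    # fifth (last) item: 16, because counting starts at 0
print(powers[-1])   # a negative index counts from the end: 16
```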

To add items to the list you can use the append function

In [15]:
my_list.append(32)

To find out how many items are in the list use the len function

In [16]:
print(len(my_list))
6

To create an empty list, you can use the list constructor function then append items later

In [17]:
my_new_list = list()

my_new_list.append(10)
my_new_list.append(12)

print(my_new_list)
[10, 12]

User Input

The code we have used so far always does the same thing. Useful real code acts on input data. One way to get this data is directly from the user.

Let’s capture the user’s name and display a welcome message:

In [18]:
# Python 3
name = input("Enter your name: ")

print("Welcome " + name)
Enter your name: c
Welcome c

Exercise 1 - User Input

In the above user input example we created the variable name. What is the type of name?

  1. int
  2. str
  3. float
  4. list

Comments

As your program gets more complicated you may want to add some notes to your code to help you remember what your code does or to explain to others who are using your code later.

In Python everything to the right hand side of a # symbol is considered a comment and is ignored by Python.

TIP: Comments can also be used to quickly disable specific lines of code.

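Python has no dedicated multi-line comment syntax. A common substitute is to wrap the text in a pair of triple quotation marks, as shown below. Strictly speaking this creates a string that is never used, which is why Jupyter displays it as output when it is the last thing in a cell.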

In [49]:
# This is a single line comment

# Python will ignore the following code
# a = 10

"""
This is
a
multi line
comment
""" 
Out[49]:
'\nThis is\na\nmulti line\ncomment\n'

Conditionals

So far when we press Run the code has been executed line by line. But what about if we want to execute a different piece of code based on the current state or value of a variable?

This is where the conditional statements if, elif and else come into play. These statements allow us to branch the execution path.

In [7]:
# Accept input as str 
number_str = input("Enter a number: ")
# Convert str to integer
number_int = int(number_str)

if number_int > 3:
    print("number is greater than 3")
elif number_int == 3:
    print("number is equal to 3")
else:
    print("number is less than 3")
Enter a number: 5
number is greater than 3
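Python checks the conditions from top to bottom and runs only the first branch that matches. The sketch below illustrates this with several elif branches. The mark and the grade boundaries are invented for this example, and the value is fixed in the code instead of being read with input.

```python
mark = 72  # hypothetical exam mark, fixed here instead of using input()

# Hypothetical grade boundaries, chosen only for this illustration
if mark >= 85:
    grade = "High Distinction"
elif mark >= 75:
    grade = "Distinction"
elif mark >= 65:
    grade = "Credit"
elif mark >= 50:
    grade = "Pass"
else:
    grade = "Fail"

print(grade)
```

A mark of 72 also satisfies mark >= 50, but the "Credit" branch comes first, so that is the only one that runs.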

Exercise 2 - Nesting

In the above conditional example we used the following two lines of code to collect the number

number_str = input("Enter a number: ")
number_int = int(number_str)

Can you think of a way to do this in one line of code?

Loops

A fundamental component of computer programming is iteration. In other words the ability to do repeated tasks.

In Python we do this using loops. There are two types of loops:

  • For loop
  • While loop

Let’s start with an example of the For loop by iterating over our list of numbers that we created earlier and printing their values. (If you ran the append example above, your list will also contain 32.)

In [10]:
for number in my_list:
    print(number)
1
2
4
8
16

You can also use list indexing to do the same thing. Try this

In [14]:
for i in range(len(my_list)):
    print(my_list[i])
1
2
4
8
16

While loops are a little more complicated. They continue iterating until a conditional statement becomes false. Try the following

In [19]:
count = 0
while count < len(my_list):
    print(my_list[count])
    count = count + 1
1
2
4
8
16

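Tip: Be careful with loops (particularly while loops), as you may accidentally create an infinite loop that never finishes on its own. To stop one in Jupyter, interrupt the kernel using the stop button in the toolbar or the Interrupt option in the Kernel menu. You do not need to quit Jupyter.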
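One defensive habit is to give a while loop a safety limit. The sketch below is only an illustration: the names value, steps and max_steps are made up, the limit of 1000 is arbitrary, and the break statement (not covered above) simply leaves the loop immediately.

```python
value = 100.0
steps = 0
max_steps = 1000   # arbitrary safety limit (an assumption for this sketch)

# Keep halving the value until it drops below 1
while value >= 1:
    value = value / 2
    steps = steps + 1
    if steps >= max_steps:
        print("Stopping early: safety limit reached")
        break

print(steps)
```

Here the loop ends by itself after 7 halvings, so the safety limit is never reached. It is only there in case the condition is written incorrectly.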

Exercise 3 - Loops

Can you write a loop to print the numbers from 10 to 20?

Functions

Functions in Python (and other programming languages) are based on functions from mathematics. They encapsulate a piece of repeated functionality and they allow optional input and produce optional output.

Functions are a great way to divide your code into reusable building blocks, which makes your code more readable and may save you time and effort.

Try this example

In [21]:
def add(a, b):
    x = a + b
    return x

c = add(10, 5)
print(c)
15

The function called “add” has two parameters which we called “a” and “b”. Inside the function Python will replace each occurrence of the parameters with the values that we specified.

TIP: The def keyword means “define”.
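Input and output are both optional, as mentioned above. The sketch below uses two hypothetical functions to show the difference: greet takes no input and returns nothing, while square takes one input and returns a result.

```python
def greet():
    # No parameters and no return statement
    print("Hello from a function!")


def square(n):
    # One parameter, and the result is handed back with return
    return n * n


greeting_result = greet()
print(greeting_result)   # a function without return gives back None

squared = square(4)
print(squared)           # 16
```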

String Formatting

A common task you may encounter is outputting strings that are nicely formatted. You can use f-strings to help you.

f-strings use curly braces {} to indicate a replacement. If you put the name of a variable inside the braces it will be replaced with the value of the variable.

In [23]:
name = input("What is your name?: ")
year = int(input("What year were you born?: "))

age = 2019 - year

fancy_string = f"Hi {name}, you are {age} years old"

print(fancy_string)
What is your name?: c
What year were you born?: 7
Hi c, you are 2012 years old
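The braces can hold more than a variable name. As a small sketch going slightly beyond the example above, the code below puts a calculation inside the braces and uses the :.2f format specifier to show two decimal places. The item, price and quantity are invented for the example.

```python
item = "coffee"     # hypothetical values for illustration
price = 4.5
quantity = 3

# The expression price * quantity is evaluated first, then shown with 2 decimals
receipt = f"{quantity} x {item} costs ${price * quantity:.2f}"

print(receipt)
```

This prints "3 x coffee costs $13.50".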

Classes and Objects

An object is a container that represents a “thing”. Inside the container there are:

  • Functions
  • Attributes

Imagine we want to represent a person in our program. We might define a “person” class. People have attributes like age, height, weight, name etc. They also “do” things, in other words they have functionality. Some function examples might be: eat, sleep, walk and study.

Let's look at a more practical example: a customer’s bank account. We need to keep track of the balance and which customer it belongs to. We also need to be able to withdraw and deposit money.

In [24]:
class Account:
    """A bank customer's account."""

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def __str__(self):
        return "Account for {0}, balance of {1}".format(self.owner, self.balance)

    def withdraw(self, amount):
        # Subtract the amount and return the new balance
        self.balance = self.balance - amount
        return self.balance

    def deposit(self, amount):
        # Add the amount and return the new balance
        self.balance = self.balance + amount
        return self.balance

TIP: The functions that start and end with double underscores (__) are special functions in Python. The __init__ function (initialize) is called the class constructor. It gets called when you create the object. The __str__ function (string) is called when printing the object.

Create an instance of the Account class by calling the constructor and specifying a name (use your own name) and a starting balance.

In [25]:
new_account = Account("Steve", 1000.00)

You can then access the attributes of the object by using dot notation.

In [26]:
print(new_account.balance)
print(new_account.owner)
1000.0
Steve

We have also explicitly defined the __str__ function, so you can try:

In [ ]:
print(new_account)

Accessing the functions of an object works in much the same way:

In [ ]:
new_account.withdraw(100)

print(new_account.balance)
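Because an object keeps its own data, its functions can also enforce rules about that data. The sketch below is a hypothetical variant called SafeAccount, not part of the tutorial's Account class. It assumes a rule that a withdrawal larger than the balance should be refused.

```python
class SafeAccount:
    """Hypothetical variant of the tutorial's Account class."""

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def withdraw(self, amount):
        # Assumed rule for this sketch: refuse withdrawals larger than the balance
        if amount > self.balance:
            print("Insufficient funds")
            return self.balance
        self.balance = self.balance - amount
        return self.balance

    def deposit(self, amount):
        self.balance = self.balance + amount
        return self.balance


demo_account = SafeAccount("Alex", 100.0)   # made-up owner and balance
demo_account.deposit(50.0)     # balance becomes 150.0
demo_account.withdraw(500.0)   # refused, balance stays 150.0
demo_account.withdraw(30.0)    # balance becomes 120.0

print(demo_account.balance)
```

The final balance is 120.0, because the oversized withdrawal was rejected and left the balance unchanged.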