Education


Python is a fantastic programming language for the beginner (and everyone else too!). Here is a list of some resources I have found useful in my ongoing experience of learning how to program in Python.

Free Online Tutorials

  • Official Python Website has information for beginners and links for downloading Python itself (although if you have a Mac or a Linux/Unix machine, it may already be installed).
  • Learn Python the Hard Way emphasizes general-purpose programming skills and has a somewhat sarcastic tone.
  • Codecademy focuses more on web development and has a more friendly tone.

Books

  • Learning Python by Mark Lutz is the book I used myself to learn python. It is very long and goes into probably more detail than the beginner would need about object-oriented programming. There may be other books that are better but this is the only one I am personally familiar with.

Text Editors

This is what you use to write the python programs!

  • IDLE is what I used when I first started. It is included when you download python itself.
  • Text Wrangler is what I currently use on Macs.
  • Notepad++ is what I currently use on windows machines.
  • If you want to impress your friends, and/or do things on remote servers such as in the Amazon cloud, check out the command-line text editor called Vim. It’s a bit tricky and not recommended for beginners. See the Vim Tutorial.

General Programming Resources

  • Command Line Crash Course is a good quick tutorial on basic shell commands, which are handy in executing python scripts and many more things as well
  • Github is a social, open-source focused platform for hosting code. I highly recommend using git or some other version control system in all programming projects. Feel free to take a look at my personal python sandbox here.
  • Bitbucket is a good alternative to github if you need private repositories
  • Stack Overflow is a question-and-answer site for any programming topic. If you run into difficulties it’s an excellent place to go for answers
  • Although it is controversial, I admit to having used the w3schools SQL tutorial. Understanding SQL is essential to dealing with most databases, a task that will likely come up eventually in one’s programming life

Python Tools

Once you are familiar with basic programming in python, you will find yourself wanting to leverage third party libraries and advanced tools. Here are some cool things to check out.

  • PIP is a package manager for installing third party libraries rapidly from the command line. It also manages dependencies (installs automatically all packages that your package needs to run). It’s not the smoothest thing to install but is very useful once you have it. Here are some packages I recommend (feel free to look them up, I will maybe add links later):
    • pandas, numpy, scipy, and matplotlib for scientific computing (especially statistics and linear algebra) and graphing. These are probably easier to install from the official websites than from pip since they have a lot of fortran and C dependencies.
    • httplib2 for HTTP requests
    • oauth2 for dealing with authentication when interacting with certain application programming interfaces (APIs)
    • simplejson for parsing JSON, a common data transfer language in many APIs and web services
    • nose for automated unit testing (testing your own code)
    • selenium a way to programmatically control a browser through “web-driver” commands (eg, for testing someone else’s website)
    • web.py is a web framework I have used to make a dynamic web site. Django is another popular one but I haven’t tried it personally.
    • cx_Oracle is useful if you need to interact with Oracle databases. It’s hard to install via pip due to complicated C dependencies, so just download the installer from their website instead.
    • sqlite3 is useful for working with SQLite databases. Be careful about the versioning though (it’s easy to confuse the version of the python package and the version of the SQLite database and install the wrong one).
    • xlrd is good for extracting data from Microsoft Excel files
  • iPython is a handy interactive shell with a lot of interesting features
  • iPython Notebooks are a great way to present projects that include code, graphs, mathematical formulas, and other heterogeneous content in an organized fashion. They are similar to the popular Rmarkdown documents in the R statistical computing language.
  • If/when you want to create your own python packages, be sure to check out Virtual Environments for testing the deployment process on your local machine. This is more for advanced users.
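
As a small illustration of the sqlite3 versioning point above, here is a minimal sketch. It assumes a Python installation where sqlite3 is available from the standard library (no pip install needed); the table name, rows, and the helper `demo_sqlite` are made up for this example.

```python
import sqlite3

# Version of the underlying SQLite database library that this Python
# build is linked against (not the version of the Python module itself).
print("SQLite library version:", sqlite3.sqlite_version)


def demo_sqlite():
    """Create a throwaway in-memory database and run one query (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE resources (name TEXT, kind TEXT)")
        conn.executemany(
            "INSERT INTO resources VALUES (?, ?)",
            [("Project Euler", "challenge"), ("Coursera", "mooc")],
        )
        # Parameterized query: the ? placeholder is filled in by sqlite3.
        rows = conn.execute(
            "SELECT name FROM resources WHERE kind = ? ORDER BY name",
            ("mooc",),
        ).fetchall()
    finally:
        conn.close()
    return rows


print(demo_sqlite())
```

Checking `sqlite3.sqlite_version` first is one way to tell which database version you actually have before worrying about which package to install.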

Programming Challenges

I have enjoyed trying to solve problems on the following sites as a fun way to build up my programming skills:

  • Project Euler is a puzzle site focusing on mathematical challenges that can be solved with programming. It will help you improve the speed and efficiency of your code. You might need to look up some topics about number theory from time to time.
  • Rosalind is a bioinformatics learning site with great interactive practice problems for learning python

MOOC Sites

This isn’t particularly python-related, but if you want to keep learning more about programming (or really anything else for that matter), the following massive online open course (MOOC) sites are recommended:

  • Coursera is one I have used myself. The Machine Learning class is particularly famous and worth trying.
  • Udacity I haven’t tried but it seems to focus more on practical skills like entrepreneurship than academic topics
  • EdX seems to be similar to Coursera except maybe is less strict about deadlines for course assignments
  • Khan Academy is a friendly site with a lot of nice videos about high-school and college level subjects (eg, if you need to quickly review a particular topic in calculus or basic programming)

I hope these lists of resources were helpful for you. If you think of any that I missed and you really like, feel free to add a comment and I’ll try to update the post to benefit future readers.


Sometimes, especially when reading news articles, I get the feeling people consider probabilities and odds to be the same thing. For example, here is a Business Insider headline claiming that Nate Silver is predicting “92% odds” of Obama victory. I think what they really mean is that Mr. Silver is predicting a 92% probability of Obama victory. There is a big difference between these two statements! Mathematically, the odds of an event are defined as the probability of the event happening divided by the probability that the event doesn’t happen. So, while a probability can take on any value from 0 to 1 (or, in percentage terms, 0% to 100%), the odds can range anywhere from zero to infinity. In fact, when the odds of an event are 1, the probability is only 50%. In the example from above, if the odds of Obama victory were really 92% (=0.92), then the probability of victory would be only 0.92/(1+0.92) ≈ 0.48, or 48%.

[Figure: plots showing the relationship between probability and odds]
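
The conversion between probability and odds can also be checked numerically. This is a minimal sketch of the definition given above; the function names are my own and not from any library.

```python
def probability_to_odds(p):
    """Odds = P(event) / P(not event). Undefined at p = 1 (infinite odds)."""
    if not 0 <= p < 1:
        raise ValueError("probability must be in the range [0, 1)")
    return p / (1 - p)


def odds_to_probability(odds):
    """Invert the definition: p = odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)


# A 92% probability corresponds to odds of roughly 11.5 (to 1).
print(round(probability_to_odds(0.92), 2))
# Odds of 0.92 correspond to a probability of roughly 0.48.
print(round(odds_to_probability(0.92), 2))
# Odds of exactly 1 correspond to a 50% probability.
print(odds_to_probability(1))
```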

Finally, here is another blog post from “Simply Statistics” that illustrates the importance of variance in comparing statistical estimates. The main idea is that if the variance of your estimate is small (ie, that the estimate is very precise), then it could be numerically close to some other comparison value but still be considered “significantly different”.

 

Many people are interested in trying out Google’s new social network Google+, but trying to re-add hundreds of friends is tedious. Facebook deliberately prevents users from exporting the email addresses of their friends directly. However, there is an indirect, legal way to extract all of your contacts from Facebook in a format suitable to be imported into any other social network or email system such as Gmail or Microsoft Outlook.

  1. Set up a free Yahoo Mail account.
  2. Select the “Contacts” tab and click “Import Contacts”
  3. Select “Facebook” and sign in with Facebook credentials.
  4. Yahoo imports the email addresses of all your Facebook friends automatically.
  5. Again on the “Contacts” tab, select “Actions” and click “Export All”
  6. Select “Export All” as Yahoo CSV (comma separated values list).
  7. Save this backup file to your local computer. You can open it in Excel or any text editor.
  8. Now sign into Google+. If you don’t have Google+ yet, sign up for a free Gmail account and skip to step 12.
  9. In Google+, select “Circles” and then “Find and Invite”.
  10. Select Yahoo and sign in with your Yahoo credentials.
  11. Google+ automatically imports all of your friends. If they are already on Google+ you can now add them to a circle. If they are not on Google+, you can send them an email invitation.
  12. If you don’t have Google+ you can also just import your friends into Gmail contacts and they will then appear in Google+ if/when you do sign up. To do this, log into Gmail and select Contacts, then Import Contacts. Gmail will ask you where your CSV file is stored (from step 7). Once it has uploaded, all of your Facebook friends’ email addresses will be in your Gmail contacts.
  13. Note that you can also use the Yahoo contact exporter to transfer Facebook friends’ emails into Microsoft Outlook or any other email/ address book (you might have to export as Vcard or some other format if the CSV option doesn’t work).

And that’s it! I look forward to seeing all of you on Google+.

I recently had a conversation with two friends from work about personal finance and investing. Having read many articles about the precarious state of most American families’ finances, I am trying to discipline myself to spend less than I earn and plan for the future. Here’s my basic strategy:

  1. Build up “emergency savings” cash in a regular savings account, enough to pay all bills for 3-6 months. Bankrate is a good tool to compare offerings by online banks.
  2. Pay off debt, especially student loans since they are the only ones that can’t be discharged by personal bankruptcy.
  3. Put some of each paycheck into a roth IRA (can fund it up to $5000 per year of earned income)
  4. Within the roth IRA, buy index funds once every 2-3 months (and almost never sell, to avoid commission fees) to maintain asset allocation based roughly on the following formula:
  • 30% domestic (US) stocks, such as SPY
  • 15% real estate investment trusts, such as VNQ
  • 15% international developed market stocks (Europe, Japan), such as VEA
  • 12.5% government bonds, such as TLT
  • 12.5% inflation protected government bonds, such as TIP
  • 10% emerging market stocks (Brazil, China, etc), such as VWO
  • 5% commodities/ individual stocks, such as POT

Any dividends/interest/tax refund/etc gets reinvested in roth as well. I use Scottrade as broker but there are a bunch of other cheap ones too.
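
To see what the allocation formula above means in dollars, here is a minimal sketch that splits a contribution across the example funds. The tickers and percentages are the ones listed above; the function name and the $5000 figure used in the call are just for illustration.

```python
# Target allocation in percent, using the example tickers from the list above.
TARGET_ALLOCATION = {
    "SPY": 30.0,   # domestic (US) stocks
    "VNQ": 15.0,   # real estate investment trusts
    "VEA": 15.0,   # international developed market stocks
    "TLT": 12.5,   # government bonds
    "TIP": 12.5,   # inflation protected government bonds
    "VWO": 10.0,   # emerging market stocks
    "POT": 5.0,    # commodities / individual stocks
}


def split_contribution(amount, allocation=TARGET_ALLOCATION):
    """Return the dollar amount to hold in each fund for a given total."""
    total_pct = sum(allocation.values())
    if abs(total_pct - 100.0) > 1e-9:
        raise ValueError("allocation percentages must sum to 100")
    return {ticker: round(amount * pct / 100.0, 2)
            for ticker, pct in allocation.items()}


for ticker, dollars in split_contribution(5000).items():
    print(ticker, dollars)
```

Comparing these targets with the current balances every 2-3 months shows which fund is furthest below its target and should get the next purchase.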

The reason I prefer index funds over individual stocks is because the risk is associated with the whole economy, rather than an individual company, and I don’t have the skill to predict which company will go up or down in the future (on the other hand, because the variation in returns is smaller, the potential for large gains is also less). Mutual funds have the same features of diversification, but the reason I don’t like them is because they often have high “management fees”. In the 401(k) you have almost no choice but to put up with such fees, which is why I like the roth IRA better. Here are some more articles that explain the pros and cons of index funds:

http://www.nytimes.com/2009/02/22/your-money/stocks-and-bonds/22stra.html

http://www.nytimes.com/2010/02/06/your-money/stocks-and-bonds/06wealth.html

Vanguard is a company that is well-known for its low-fee funds (example: VWO tracks emerging market equities)

Another example of an exchange-traded index fund (ETF) that tracks the S&P 500 index is SPY.

I also like the “motley fool” site for general personal finance advice.

Finally, here are a couple of counter-perspectives:

http://www.fool.com/investing/general/2010/12/08/does-warren-buffett-really-think-index-funds-are-b.aspx

http://www.investopedia.com/articles/stocks/09/reasons-to-avoid-index-funds.asp

 

Recently while doing my laundry, I noticed a cone shaped receptacle at the top of the central shaft of my washing machine. My roommate told me it was for fabric softener, and that the cone holds the liquid in place until the spin cycle, at which point the fast rotation of the shaft causes it to migrate out to the edge of the cone, where it can then drip over the edge into the bottom of the washer. I wondered, given a particular cone steepness, how fast would the shaft need to spin to overcome the force of gravity holding the fabric softener in place? As you can see in the force diagram, we assume the cone is symmetrical with a radius “r” and angle of steepness “θ”. Ignoring friction and viscosity effects, there are two forces acting on the drop of fabric softener: a normal force “N” and gravity, which is simply the mass of the liquid “m” multiplied by the gravitational acceleration “g”.

At the threshold point where all forces are balanced, the liquid does not accelerate up or down the cone. Any rotational velocity higher than that will cause the liquid to migrate out of the cone. At the threshold velocity “v”, the vertical component of the normal force equals the gravitational force:

$$N \cos\theta = m g$$

And the horizontal component of the normal force equals the centripetal force:

$$N \sin\theta = \frac{m v^2}{r}$$

By solving these two equations simultaneously (by substituting for N), we find that the threshold rotational velocity is determined by:

$$v = \sqrt{g \, r \tan\theta}$$

Where the angle of steepness is somewhere between perfectly flat and vertical:

$$0 < \theta < \frac{\pi}{2}$$

Surprisingly, the mass of the liquid has no effect on the threshold speed. Considering the cone in my laundry machine has a radius of about 2cm and an angle of steepness around 45° (=π/4), the threshold velocity would be about 0.443 m/s. Dividing this by the circumference of the cone (0.126 m) gives an angular velocity of about 3.5 rotations per second. In other words, the washing machine must rotate at least 3.5 times per second to “push” the fabric softener over the edge of the cone. Seems pretty fast! No wonder washing machines consume so much electricity.
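
The numbers above can be reproduced with a short script. This is a minimal sketch assuming the threshold relation v = sqrt(g r tan θ) and g = 9.8 m/s² (my assumed value); the function names are my own.

```python
import math

G = 9.8  # gravitational acceleration in m/s^2 (assumed value)


def threshold_speed(radius_m, steepness_rad, g=G):
    """Tangential speed (m/s) at which the liquid starts climbing the cone."""
    return math.sqrt(g * radius_m * math.tan(steepness_rad))


def rotations_per_second(radius_m, steepness_rad, g=G):
    """Threshold speed divided by the circumference of the cone."""
    circumference = 2 * math.pi * radius_m
    return threshold_speed(radius_m, steepness_rad, g) / circumference


# Cone with a 2 cm radius and a 45 degree angle of steepness.
v = threshold_speed(0.02, math.pi / 4)
rps = rotations_per_second(0.02, math.pi / 4)
print(round(v, 3), "m/s")
print(round(rps, 1), "rotations per second")
```

Note that the mass never appears as an argument, which matches the observation that it cancels out of the equations.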

My favorite fruit in the world is Guanábana (Annona muricata, aka “Soursop”).

Guanábana / Annonaceae fruit

A very large guanábana fruit in a Colombian tienda. Photo Credit: jjrestrepoa

Native to Latin America, it also grows in tropical Southeast Asia and Africa. In the Philippines it is called “Guayabano”. The first time I tried it was in a smoothie at the Café Havana in Greenbelt Mall, Makati. Since I returned to the US, I have vainly attempted to find the fruit in grocery stores and Asian markets, although I did get to try it again on a recent trip to Puerto Rico. Right when I was about to give up, however, I discovered that another South American fruit, the cherimoya, is in the same genus and is more widely available (at least here in Charlottesville). Furthermore, one of my coworkers shared with me that there is a fruit called pawpaw found in North American forests that is in the same family. Actually, it’s the tree with the gigantic, deciduous leaves. Kentucky State University has a major pawpaw research program, and offers a great general information site. Here’s our exchange, as well as some comments from my college botany professor:

[ME] I looked up pawpaw and it is indeed in the Annonaceae family. In fact, the genus to which it belongs, Asimina, is one of the only temperate representatives from the family. Most of the other edible fruits in that family are from the tropical genus Annona. Annonaceae, along with Magnoliaceae and the nutmeg family Myristicaceae, are all very closely related in that they are from a “primitive” (ancient) lineage of flowering plants, lacking well defined petals or sepals and with leaves characterized by a “ranalean” odor of aromatic, essential oils when crushed (botanically, they are part of the order Magnoliales).

[STEVE, my coworker] Incidentally, there are lots of pawpaw trees around the lower parts of the Old Rag hike, but I haven’t yet seen one with fruit.  I have seen them with fruit near Sugar Hollow reservoir and, as I mentioned, a few weeks ago [another coworker] brought one in that he picked near the Monticello trail.

[DR. KNOX, my professor] Yes, I’ve had cherimoya in Peru, and it is delicious. It’s been so long since I’ve eaten it and I only had one infructescence, so I don’t remember much, other than that it was sweet and similar to pawpaw. In cherimoya, the many pistils in each flower enlarge until they are packed tightly together to give an accessory fruit that looks like a grenade. I’ve seen other Annonaceae growing in the wild in Panama and Costa Rica. As for pawpaw, fruit production seems to vary a lot from population to population. For example, I have watched the many pawpaw trees on our back campus for years, and though they form many flowers, I scarcely ever see fruit. I’ve wondered if they lack the appropriate genotypes to set seed and fruit, or if conditions for pollination, fertilization, or fruit development are not very good there. But then downstream about a mile along the Maury the trees usually do produce fruit.

So, why isn’t this fruit more widely cultivated and consumed? According to the Christian Science Monitor, the fruit’s rapid spoilage rate, poisonous seeds, and carrion-fly pollination mechanism doomed its prospects for commercialization. Nevertheless, I hope you all won’t hesitate to try one of the delicious fruits from this fascinating plant family if you get the chance. There might even be one in your own back yard (just omit the seeds, like a watermelon). I think I’ll go eat one right now!

Last night I was reading through Hello World!, an introductory programming book loaned to me by a coworker, and decided to give Python a crack. Here’s my first attempt. Just paste it into a text editor and save as ‘lat2dms.py’. Then run it in IDLE GUI (free download) with F5. If you type lat2dms(###) in IDLE it will convert the decimal degree value ‘###’ to degree-minute-second format (four outputs, including N/S to indicate direction). I originally wrote it to output a concatenated string of those values but couldn’t figure out how to include the degree, minute, or second symbols (eg ‘ and ”). My ultimate goal is to use this as a building block for a script that will import a CSV file with geodata  in one format and output a new CSV file with the same data in a different format, suitable to be used with Excel or any compatible geographic information system. I’m about halfway through the loops chapter, so hopefully that will come in handy. This has been a fun exercise and I hope I can continue learning on the side.

#this is a program to convert a decimal latitude value into degrees minutes seconds format
#by Will Townes 25 AUG. 2009
#https://willtownes.wordpress.com

def lat2dms(ilat):

    # reject values outside the valid latitude range before doing any conversion
    if abs(ilat) > 90:
        print("ERROR- invalid latitude value. Please enter a value between -90 and 90.")
        return None

    # this conditional defines whether the lat. is north or south
    if ilat >= 0:
        direction = 'North'
    else:
        direction = 'South'

    ilat = abs(ilat)

    # here we parse the initial, decimal value into degrees, minutes, and seconds
    degrees = int(ilat)
    rawminutes = 60*(ilat-degrees)
    minutes = int(rawminutes)
    seconds = int(60*(rawminutes-minutes))

    print("(degrees, minutes, seconds, direction)")
    return degrees, minutes, seconds, direction
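
For the concatenated-string output I couldn’t get working, one possible approach is sketched below. It assumes Python 3, where the degree symbol can be written as the escape `\u00b0`; the function name `format_dms` and the choice to raise an error for invalid input are my own for this example, not part of the original script.

```python
def format_dms(lat):
    """Return a latitude as a degree-minute-second string, e.g. 38° 30' 0" N."""
    if not -90 <= lat <= 90:
        raise ValueError("latitude must be between -90 and 90")

    direction = "N" if lat >= 0 else "S"
    lat = abs(lat)

    # Same parsing steps as lat2dms: whole degrees, then minutes, then seconds.
    degrees = int(lat)
    raw_minutes = 60 * (lat - degrees)
    minutes = int(raw_minutes)
    seconds = int(60 * (raw_minutes - minutes))

    # \u00b0 is the degree sign; ' and " stand in for minutes and seconds.
    return "{}\u00b0 {}' {}\" {}".format(degrees, minutes, seconds, direction)


print(format_dms(38.5))
print(format_dms(-0.25))
```

Like the original, this truncates the seconds to a whole number rather than rounding them.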