Speeding Python Up With PyPy

I’ve been working on some code that uses a set of supplied regular expressions to search through log files (I know, regex isn’t that efficient, yadda, yadda, yadda, but these were the requirements). The issue I ran into was the sheer volume of data: for example, I had 10 regexes searching 36 gzipped files averaging 1.2 million lines each. The real problem was that these logs came in hourly, so if the search couldn’t finish within an hour, the work was going to back up.
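To make the workload concrete, here is a minimal sketch of that kind of search. It is not the actual script; the function names `load_regexes` and `search_log` are made up for illustration, and it assumes the logs are UTF-8 text.

```python
import gzip
import re


def load_regexes(patterns):
    """Compile each supplied pattern once, up front (hypothetical helper)."""
    return [re.compile(p) for p in patterns]


def search_log(path, regexes):
    """Return (lines_read, matching_lines) for one gzipped log file."""
    total = 0
    hits = []
    # "rt" decodes the gzip stream to text; undecodable bytes are replaced
    # rather than raising, which is an assumption about how dirty logs are.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            total += 1
            # A line counts as a hit if any one of the regexes matches it.
            if any(rx.search(line) for rx in regexes):
                hits.append(line.rstrip("\n"))
    return total, hits
```

Compiling the patterns once and streaming the file line by line keeps memory flat, but every line is still tested against every regex, which is where the time goes.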

Being a good Pythonista, I followed the cardinal rules of:

  1. Get it right.
  2. Test it’s right.
  3. Profile if slow.
  4. Optimize.
  5. Repeat from 2.

Source: http://wiki.python.org/moin/PythonSpeed/PerformanceTips
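For the "profile if slow" step, the standard library's `cProfile` and `pstats` modules are one option. The sketch below is an illustration only: `scan` is a stand-in for the real search function and `profile_call` is a hypothetical helper, neither taken from the actual script.

```python
import cProfile
import io
import pstats
import re


def scan(lines, pattern):
    """Stand-in workload: count the lines that match one pattern."""
    rx = re.compile(pattern)
    return sum(1 for line in lines if rx.search(line))


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and show only the five most expensive entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()
```

Calling `profile_call(scan, lines, r"ERROR")` returns the match count along with a report of where the time was spent. For a whole script, `python -m cProfile -s cumulative alerter.py` gives a similar report without code changes.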

The problem was that after a while I hit a wall. Nothing I did made this code appreciably faster. (Of course, this was with my limited knowledge. I’m sure that more experienced Python programmers could optimize this code a lot better than I can, rewrite the regex bottleneck in C, etc.) I was at the end of my rope.

One thing led to another and I remembered reading about PyPy. PyPy is an implementation of Python that uses a JIT (just-in-time) compiler and other things that have lost their meaning for me since I did systems programming in college. “What the heck,” I thought, “I’ll give it a try.” PyPy is supposed to be highly compatible with CPython (the regular Python implementation), and my code didn’t use any exotic libraries.

So I dumped the tarball on my Linux machine, unpacked it, and ran my unmodified code against it, and DAMN was it fast.

CPython run:

sean@linux1:~/code/python/hourly_alerts$ python alerter.py
Loaded regexes
Processing known_bad.log.gz
Searched 817051 lines in 119.398156166 seconds using filter

PyPy run:

sean@linux1:~/code/python/hourly_alerts$ /home/sean/bin/pypy-1.7/bin/pypy alerter.py
Loaded regexes
Processing known_bad.log.gz
Searched 817051 lines in 51.1275110245 seconds using filter

More than twice as fast (roughly 2.3x, from 119.4 seconds down to 51.1)! Now, I know this was a totally unscientific test, but it’s great to see such an improvement right away.
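If you want to repeat this kind of comparison yourself, a small timing harness that runs unchanged under both interpreters is enough. This is a sketch under assumptions: `timed_search` and the sample data are invented for illustration and are not the code that produced the numbers above.

```python
import platform
import re
import time


def timed_search(lines, patterns):
    """Count lines matching any pattern; return (matched, elapsed_seconds)."""
    regexes = [re.compile(p) for p in patterns]
    start = time.perf_counter()
    matched = sum(1 for line in lines if any(rx.search(line) for rx in regexes))
    elapsed = time.perf_counter() - start
    return matched, elapsed


if __name__ == "__main__":
    # Synthetic data, made up for this sketch; real logs will behave differently.
    sample = ["GET /index.html 200", "POST /login 500"] * 100000
    matched, elapsed = timed_search(sample, [r" 500$", r"/admin"])
    # python_implementation() reports "CPython" or "PyPy", so the same file
    # labels its own output under either interpreter.
    print("%s: %d of %d lines matched in %.3f seconds"
          % (platform.python_implementation(), matched, len(sample), elapsed))
```

Run it once with `python` and once with the `pypy` binary. Because a JIT needs time to warm up, very short runs can understate what PyPy does on a long job.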

I may also try Cython, but it looks like it doesn’t have quite the drop-in functionality of PyPy.

Here is the link to PyPy: http://pypy.org/
