links for 2008-09-25

Standard

Google App Engine limitations, and how to get around them

Standard

Using Google App Engine has great advantages, but there are also serious limitations in the platform that it’s important to be aware of. Not all applications can be implemented and executed with Google App Engine. Google is working hard on removing a lot of these limitations, and they will probably do so eventually, but in the meantime that isn’t really any help if you are writing a web app today.

This post was originally published in swedish on Mashup.se, my blog about swedish mashups and APIs.

Only Python
If you don’t know Python you don’t have choice if you want to use Google App Engine, you just have to learn. It’s not a difficult programming language to learn if you already know how to program. Django, the Python framework that really speeds up writing applications for Google App Engine, is also quite easy to pick up (tip: Use the Google App Engine helper for Django). Before I started with Google App Engine I hadn’t written a single line pf Python and after a few intense days of concentration and coffee consumption I knew Python, Django and Google App Engine quite well.

Most 3rd party python libraries work perfectly on Google App Engine (Django is just one example), but there are limitations. Only libraries that are 100% Python can be used, so if the library has any code in C it can’t be used on Google App Engine. If a librarry has any code that makes a HTTP request or similar it can’t be used on Google App Engine either. All HTTP requests have to be done via Google App Engines URL Fetch API.

Differences between local dev environment and the production environment
One of the advantages with Google App Engine is that there is a local development server that simulates how Google App Engine works when an application is in production. This makes it easy to develop an application locally and then deploy it to the live servers on Google App Engine. As a developer it is important to pay attention to the differences between executing an application on the local server compared to the production environment. It’s no fun to spend time writing code that just don’t work once deployed.

The biggest difference betweent the two environments are that the requests made to 3rd parties work differently. If you have an application that uses the Delicous API it will work fine locally, but once deployed it wont work at all. The reason is that Delicous is blocking all requests from Google App Engine IP-addresses. Same thing is true for the Twitter API due to some HTTP headers that Google App Engine sets (Twitter claims to have fixed this now, haven’t had a chance to test yet). To avoid these problems you need to test your application often in the live environment, especially when you use APIs to call other services.

Datastore
The Google App Engine datastore has some limitations due to being a distributed database. The most obvious limitation is that there isn’t an “OR” operator in GQL (the Datastore version of SQL), but that is easily handled when coding. A more annoying limitations is that it is not possible to create a new entity (data object) in the Datastore via the Google App Engine dashboard unless there already exists an entity of this type, and that there is a real noticable delay between what one can see in the Datastorer in the Dashboard compared to what really is stored in the Datastore. This makes it very hard to really check what data that is stored in the Datastore in any given moment, which makes debugging more difficult.

No scheduled processes
One of the most restrictive limitations with Google App Engine is that there is no way to start a process, other than via a HTTP request. There is no type of recurring scheduled process (like a cron job) and no triggers or hooks to use to start a process when a special event occures. Almost all web applications need som kind of scheduled process to function correctly – to clean up old data, send emails, consolidate statistics or fetch data from an RSS feed once an hour.

The easiest way to get around this limitation is to create a cron job on another server that calls an URL in the Google App Engine application. If you have a lot of visitors you can also perform the background process as part of a user request. For example you could import data from an RSS feed when a user logs in to your application. This will of course make the user experience slow and there is no guarantee that users will perform the action to trigger the background process att the right times.

No matter which solution is implemented one will quickly bump into the next limitation of Google App Engine, that only short processes are permitted.

Only short processes
Something I have learned the hard way is that Google App Engine only is built to handle applications where a users makes a request and quickly gets an answer back. There is no support for a process that  executes for a longer time, and for the time being this limitations seems to be approximatly 9 secondes/process. After that an exception is thrown and the process killed. It does not end there, even if you are nowhere close to use the assigned CPU resources you can quickly use too many resources with long running processes since they have their own unique resource pool. If you use too many resources you application is shut down for 24 hours, and right now there is no way to buy extra resources.

This is a really serious limitation in Google App Engine that it is really difficult to get around. If you need heavy processes it is recommended that you use Amazon EC2 or something similar. To handle (not get around) the limitation you have to handle exceptions in a nice way and use transactions. More about this in William Vambenepe’s very informative post Emulating a long-running process (and a scheduler) in Google App Engine. He has some tips on how to get around this limitation, even if it is not recommended since you risk that your application is shut down.

More limitations
There are more limitations, read  Google App Engine: The good, the bad, and the ugly? for a longer list. Just keep in mind that some of the limitations mentioned in that post already have been addressed by Google.

To summerize you need to really know what limitations there are in Google App Engine before you spend time and energy on developing an application. If the limitations are not a problem then there is a lot to gain by using Google App Engine.

Using WordPress as a CMS

Standard

I’ve used dozens of Content Manage Systems, both open source versions like Drupal and Joomla and Enterprise level ones as Microsoft CMS and others. They all have one thing in common – they suck (that is a technical term for “providing less than optimal results”). Either they are too complicated or too simplistic, or in some cases, both at the same time. After some experimentation I have found a CMS that works perfectly for most of my needs, and if it do not fit a project I write my own (as I did for WebHostNinja.com). This perfect(ish) CMS is the blogging platform WordPress.

At the moment I am using WordPress as a CMS for several websites. Of course this blog is running on WordPress, not that suprising really since it is a blog. The main site of blendapps.com is all WordPress, while the acctual chat application that integrates Meebo chat with Ning social networks, run on the subdomain chat.blendapps.com and is implemented on Google App Engine. For Blendapps.com there is both a seperate blog section and a content section, so it is by no means a pure blog. Mashup.se is also running WordPress, and there I have tried to have more of a newspaper feel than a pure blog. The purest use of WordPress as a CMS is probably on restillmexico.se (swedish travel guide to Mexico), that site doesn’t even have a blog.

Why WordPress?
My requirements for a CMS are simple – it should be quick to implement a new site, it should be easy to maintain a site and it should be easy to expand and customize the site when it grows. A plus is if it runs on LAMP, so I can hack it if neccessary and so I can host it anywhere. There is not really any need to support multiple concurrent users since it is mostly going to be yours truley doing the work. Not rocket science really, but evidently damn hard to implement since most CMSes miss the mark by miles.

Let’s take a look why WordPress live up to these requirements:

  • Quick to implement a new site – setting up WordPress is a matter of minutes. Get the WordPress files, set up a database and go. For some extra bonus you can add the plugins you need and choose one of 100’s of free well done themes to get the look you want. You can of course also do as I do and get a professional design and just add that on top of an exising theme.
  • Easy to maintain – the WordPress control panel is easy to use, and if you dont like it you can change it around with plugins (see below for some such plugins). WordPress and most plugins are continously updated and to install updates is a breeze. All you need – such as static pages, blog posts, tagging, categories, multiple users, comments, spam protection (Akismet) etc is all part of the basic WordPress install.
  • Easy to expand and customize – again, the plugins are really great for customization. Also, WordPress have really powerful and usefull template tags that allows you to customize the theme well beyond recognition. See tips and tricks below for more details.
  • LAMP – check. All written in PHP and uses a MySQL database perfectly. Even if I at times have hacked the WordPress core files it is not really necessary, but the key thing is that it is possible.

Pages and Custom Fields are your friends
Content in WordPress can be either shown in “posts” or “pages”, where posts are blog posts and pages are static content. Under the surface it is all the same, both posts and pages are stored in the same tables, but they are still treated a bit differently. If you are using WordPress as a CMS then pages are definitly your friends. I have found that pages are a little easier to organise if you are not writing a blog. You can set a page as the front page of your site in the WordPress control panel. Go into settings -> reading and choose what page you want as front page.

Custom Fields are data fields that are not part of the core WordPress installation but that you define yourself. A custom field is attached to a post or a page and can be used to store any extra info that WordPress or plugins dont handle. For restillmexico.se I use it to store the URL or the images for each page, as well as the attribution of the images etc (a trick from WordPress Custom Fields: Adding Images To Posts).

Plugins to turn WordPress into a CMS
Even if a basic WordPress installation works fine as a CMS there are a few plugins that makes your life easier:

  • CMS-like admin menu –  all it does is make posts second to pages in the menus, ie puts pages before posts. If all you have on your site is pages then this plugin is a nice touch.
  • PageMash – drag and drop interface to order pages. Makes it very easy to change order as well as set parent/child relationship among pages. I especially use this one heavily to organise the hierarchy at restillmexico.se.
  • Page Excerpt – adds the excerpt functionality of posts to pages
  • Query Child Of – makes it possible to list the children of a page in your theme, great in combination with PageMash mentioned above
  • Admin drop down menu – it gives the control panel a drop down menu, which is quicker and easier to use than the standard control panel menu. Not really unique for using WordPress as a CMS, but I really like this plugin.

Tips and tricks
To hack a wordpress theme to work as well as a CMS there are a few tips and tricks. The first tip is to not hack at all, just use the great CMS2 theme. It already has a page focus and most of the organisation needed for a static web site. For the sites I have on WordPress I have either customized this theme or one of the themes that comes with the basic WordPress install.

When customizing a theme there are some WordPress template tags that will save you a lot of time:

  • wp_list_pages – allows you to list the pages you have in your site, pefect to create a menu with. You can include or exclude any pages you want.
  • query_posts – get the posts you want by using the parameters to this tag. Since pages and posts really are the same thing this tag can be used to list pages when wp_list_pages doesn’t do exactly what you need.
  • get_post_meta – get the value of a specific custom field for a page or a post, necessary to be able to get the most out of custom fields.

Summary
Wordpress is a very good option as a CMS, not just for blogs but for static sites as well. With the great choice in plugins and the open source code of WordPress you can get it to do almost anything you want without serious hacking. The huge installbase ensures that you can find help and resources if you need it.