Installing SRILM on Ubuntu 11.10
I’ve recently upgraded to Ubuntu 11.10, which I had avoided because of the Unity fiasco and reports of general “bugginess.” 10.04 worked really well for me for years. One of the first jobs after upgrading was to install SRILM. Previously, installation had proceeded without incident. Not this time. Here is how I got it to install in /usr/share/srilm on a 64-bit architecture:
- mkdir /usr/share/srilm
- mv srilm.tgz /usr/share/srilm
- cd /usr/share/srilm
- tar xzf srilm.tgz
- sudo apt-get install tcl tcl-dev csh gawk
- In Makefile, uncomment the SRILM= parameter and point it to /usr/share/srilm (or your equivalent path)
- make NO_TCL=1 MACHINE_TYPE=i686-ubuntu World
- Add the following to your .bashrc:
SRILM=/usr/share/srilm
export PATH=$PATH:$SRILM/bin:$SRILM/bin/i686-ubuntu
export MANPATH=$SRILM/man:$MANPATH
Now you should be able to run ‘make test’ successfully.
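A quick way to confirm that the build put binaries where the PATH line above expects them is a small check like the one below. It is only a sketch: the function name `missing_srilm_tools` and the choice of tools to look for are illustrative assumptions, and the defaults simply mirror the install path and MACHINE_TYPE used in the steps above.

```python
import os

# Assumed defaults: these mirror the install path and MACHINE_TYPE used above.
SRILM_ROOT = "/usr/share/srilm"
MACHINE_TYPE = "i686-ubuntu"


def missing_srilm_tools(root=SRILM_ROOT, machine_type=MACHINE_TYPE,
                        tools=("ngram", "ngram-count")):
    """Return the tools that are not executable files in root/bin/machine_type."""
    bin_dir = os.path.join(root, "bin", machine_type)
    missing = []
    for tool in tools:
        path = os.path.join(bin_dir, tool)
        # A tool counts as present only if it is a regular, executable file.
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(tool)
    return missing


if __name__ == "__main__":
    missing = missing_srilm_tools()
    if missing:
        print("Missing or not executable:", ", ".join(missing))
    else:
        print("SRILM binaries found.")
```

If the list comes back non-empty, the usual suspects are a MACHINE_TYPE that does not match the directory name under bin, or a build that stopped early.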
HOWTO: Basic Arabic Preprocessing for NLP
Raw Arabic text is difficult to process. Errant diacritics, strange Unicode characters, and haphazard use of whitespace are all common obstacles for even basic tasks. For statistical systems, cliticization and morphological variation can induce sparsity. As a result, sophisticated preprocessing techniques have been developed, the best of which are described in these three papers:
- Nizar Habash and Owen Rambow. 2005. Arabic tokenization, part-of-speech tagging and morphological disambiguation in one fell swoop. In NAACL. [pdf]
- Nizar Habash and Fatiha Sadat. 2006. Arabic preprocessing schemes for statistical machine translation. In NAACL. [pdf]
- Mona Diab, Kadri Hacioglu and Daniel Jurafsky. 2007. Automatic processing of Modern Standard Arabic text. In Arabic Computational Morphology. [pdf]
The following scripts handle basic preprocessing:
- basic_ortho_norm.py: Simple orthographic normalization.
- run_mada: Run script for MADA+TOKAN 3.1 that performs morphological analysis and clitic segmentation in the style of the Penn Arabic Treebank.
Both of my 2010 conference papers used these scripts.
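As a rough illustration of what simple orthographic normalization can involve, here is a minimal sketch. It is not the contents of basic_ortho_norm.py: the function name `normalize_arabic` and the specific rules (strip short-vowel diacritics and the dagger alef, strip tatweel, map hamza and madda alef variants to bare alef, collapse whitespace) are assumptions chosen because they are common in Arabic NLP pipelines.

```python
import re

# Assumed rule set for illustration; not taken from basic_ortho_norm.py.
DIACRITICS = re.compile("[\u064B-\u0652\u0670]")    # fathatan through sukun, plus dagger alef
TATWEEL = re.compile("\u0640")                       # elongation (kashida) character
ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")   # alef with madda, hamza above, hamza below
WHITESPACE = re.compile(r"\s+")                      # any run of Unicode whitespace


def normalize_arabic(text):
    """Apply a few common orthographic normalizations to Arabic text."""
    text = DIACRITICS.sub("", text)           # drop diacritics
    text = TATWEEL.sub("", text)              # drop elongation marks
    text = ALEF_VARIANTS.sub("\u0627", text)  # unify alef forms to bare alef
    return WHITESPACE.sub(" ", text).strip()  # collapse and trim whitespace


if __name__ == "__main__":
    print(normalize_arabic("كِتَابٌ   جَدِيد"))
```

Collapsing the alef variants reduces sparsity but also discards distinctions that some downstream tasks need, so whether to apply that rule depends on the application.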
HOWTO: WordPress, 1and1, and MySQL
For several months now I have been unable to automatically update WordPress. Then, this afternoon, I found that I could no longer update the site manually. A quick glance at the 2.9.1 release notes revealed the problem:
Requires MySQL 4.1.2 or greater (old requirement was 4.0).
A few searches revealed that many other 1&1 users have encountered the same issue, which I resolved by making two administrative changes:
- Ensure that WordPress is running on PHP 5 by adding a line to .htaccess in the WordPress root directory.
- Migrate the WordPress database to MySQL 5.0. I followed this set of instructions exactly.