elasticsearch: installation and configuration on CentOS 6.2

Installing elasticsearch (a distributed, RESTful search engine built on top of Apache Lucene) on CentOS would be easier if it were available in the repositories. But it isn’t. And since the best way to install an application on CentOS is to use RPM packages, we will need to build one. First of all, install Java:

yum install java-1.6.0-openjdk.x86_64

To avoid writing our own SPEC file, we will use the one kindly written by Tavis Aitken. It can be downloaded from his GitHub page. After unpacking, copy the SPEC file to your ~/rpmbuild/SPECS directory and the config files to ~/rpmbuild/SOURCES (the commands below assume you run them from your home directory):

cp tavisto-elasticsearch-rpms-c78de68/SPECS/elasticsearch.spec rpmbuild/SPECS/
cp tavisto-elasticsearch-rpms-c78de68/SOURCES/* rpmbuild/SOURCES/

If you don’t have a ~/rpmbuild directory yet, install the rpm-build and rpmdevtools packages and run the rpmdev-setuptree command before copying the files:

yum install rpm-build rpmdevtools
rpmdev-setuptree

I recommend using the latest version of elasticsearch (0.19.1 at the time of writing), since it includes a lot of bug fixes (for instance, this one). To do this, modify the SPEC file to point to the latest version:

Version: 0.19.1
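
If you would rather not edit the file by hand, the same change can be scripted. The sketch below assumes the SPEC file contains one line starting with "Version:"; set_spec_version is a hypothetical helper name used here for illustration and is not part of the SPEC repository.

```shell
#!/bin/sh
# set_spec_version: hypothetical helper, not shipped with the SPEC repository.
# Usage: set_spec_version <spec-file> <version>
set_spec_version() {
    spec="$1"
    version="$2"
    # Rewrite the "Version:" line, leaving every other line untouched.
    # A temporary file is used instead of "sed -i" to stay portable.
    sed "s/^Version:.*/Version: ${version}/" "$spec" > "${spec}.tmp" &&
        mv "${spec}.tmp" "$spec"
}

# Example (commented out so sourcing this file has no side effects):
# set_spec_version ~/rpmbuild/SPECS/elasticsearch.spec 0.19.1
# grep '^Version:' ~/rpmbuild/SPECS/elasticsearch.spec
```

Running the grep afterwards is a quick way to confirm that the SPEC file now points to the version you want before downloading anything.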

Now we are ready to download the tarball:

spectool -g rpmbuild/SPECS/elasticsearch.spec
mv elasticsearch-0.19.1.tar.gz rpmbuild/SOURCES/elasticsearch-0.19.1.tar.gz

And finally, build the RPM and install it:

rpmbuild -bb rpmbuild/SPECS/elasticsearch.spec
rpm -ivh rpmbuild/RPMS/x86_64/elasticsearch-0.19.1-1.el6.x86_64.rpm

At this point we need to configure the application. Comment out all lines in /etc/sysconfig/elasticsearch. After that configure /etc/elasticsearch/elasticsearch.yml to fit your requirements. Here is a simple example:

cluster.name: elasticsearch
node.name: "Franz Kafka"
path.conf: /etc/elasticsearch
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.bind_host: _eth0:ipv4_
network.publish_host: _eth0:ipv4_
network.host: _eth0:ipv4_
transport.tcp.port: 9300
http.port: 9200
discovery.zen.ping.multicast.enabled: false

Start elasticsearch and make sure it’s running:

service elasticsearch start
netstat -lpn | grep java
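
netstat only shows that a Java process is listening. Elasticsearch may need a few seconds before it answers HTTP requests, so a small polling loop can be handy in scripts. This is a rough sketch, assuming curl is installed and the HTTP port is 9200 as in the config above; wait_for_http is a hypothetical helper name.

```shell
#!/bin/sh
# wait_for_http: hypothetical helper that polls a URL until it responds.
# Usage: wait_for_http [url] [attempts]
# Returns 0 as soon as curl gets a successful response, 1 if all attempts fail.
wait_for_http() {
    url="${1:-http://localhost:9200/}"
    attempts="${2:-30}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s: silent, -f: treat HTTP errors as failure, --max-time: do not hang
        if curl -s -f --max-time 5 -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Example (commented out so sourcing this file has no side effects):
# wait_for_http http://localhost:9200/ 30 && echo "elasticsearch is up"
```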

Enable auto-start:

chkconfig elasticsearch on

Note that for security reasons it would be a good idea to restrict access to elasticsearch with iptables.
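
As an illustration, here is one possible set of rules. It is only a sketch: the trusted subnet is an assumption you must replace with your own, the ports are the ones from the config above (9200 for HTTP, 9300 for transport), and es_firewall_rules is a hypothetical helper that only prints the commands so you can review them first. The rules use -I (insert at the top of the chain) rather than -A, because the stock CentOS firewall ends the INPUT chain with a catch-all REJECT, and anything appended after it would never match. Since every -I lands above the previous one, the DROP is printed first and ends up below the two ACCEPT rules.

```shell
#!/bin/sh
# es_firewall_rules: hypothetical helper that PRINTS iptables commands.
# Usage: es_firewall_rules <trusted-subnet>
# Nothing is changed on the system until the output is run as root.
es_firewall_rules() {
    trusted="$1"
    for port in 9200 9300; do
        # Inserted first, so it ends up last: drop everyone else.
        echo "iptables -I INPUT -p tcp --dport ${port} -j DROP"
        # Allow the trusted subnet (assumption: replace with your network).
        echo "iptables -I INPUT -p tcp -s ${trusted} --dport ${port} -j ACCEPT"
        # Keep local access working, e.g. curl against localhost.
        echo "iptables -I INPUT -i lo -p tcp --dport ${port} -j ACCEPT"
    done
}

# Example (commented out so sourcing this file has no side effects):
# es_firewall_rules 192.168.0.0/24          # review the commands
# es_firewall_rules 192.168.0.0/24 | sh     # apply them as root
# service iptables save                     # persist across reboots
```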

Now you can test whether it works. First, PUT a document:

curl -XPUT http://localhost:9200/twitter/tweet/2 -d '{
"user": "kimchy",
"post_date": "2009-11-15T14:12:12",
"message": "You know, for Search"
}'
{"ok":true,"_index":"twitter","_type":"tweet","_id":"2","_version":1}

And then try to GET it:

curl -XGET http://localhost:9200/twitter/tweet/2
{"_index":"twitter","_type":"tweet","_id":"2","_version":1,"exists":true, "_source" : {
"user": "kimchy",
"post_date": "2009-11-15T14:12:12",
"message": "You know, for Search"
}}

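Seems it’s working. Let’s delete the test data. The document was stored with ID 2, so that is the ID to delete (deleting ID 1 would only return "found":false):

curl -XDELETE 'http://localhost:9200/twitter/tweet/2'

{"ok":true,"found":true,"_index":"twitter","_type":"tweet","_id":"2","_version":2}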