Friday, 13 April 2012

Installing Graphite on CentOS - Part 3 - Posting some data to Graphite

In this final section I'll talk about some utilities out there which easily enable you to post data to Graphite. While not strictly CentOS related, it finishes off this series of posts on graphite quite nicely.

With your graphite installation up and running you should see two listening ports open - TCP 2003 (plaintext) and TCP 2004 (pickle). Carbon listens on these ports for incoming connections and translates the data into datapoints which it inserts into whisper files held under /opt/graphite/storage.

On the plaintext port, each datapoint is a single line of the form object value timestamp, where timestamp is seconds since the Unix epoch, i.e. the output of `date +%s`.

The object is simply how you want to structure your data in graphite - e.g. servers.hostname.mem would be a "folder" for memory stats on a particular host and may contain substats such as swap util, cache, buffers etc. Graphite doesn't care how you structure the data or what data you send it - it's merely a presentation tool at the end of the day!
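To make the format concrete, here is a minimal sketch of a helper that builds one such line. The function name graphite_line and the host name web01 are my own placeholders, not anything graphite provides:

```shell
#!/bin/sh
# Build one plaintext-protocol line: "<object> <value> <timestamp>".
# graphite_line is a hypothetical helper name, not part of graphite.
graphite_line() {
    metric=$1
    value=$2
    # Fall back to the current epoch time when no timestamp is given.
    timestamp=${3:-$(date +%s)}
    printf '%s %s %s\n' "$metric" "$value" "$timestamp"
}

# A swap stat filed under the servers.<hostname>.mem "folder".
graphite_line "servers.web01.mem.swap_used" 2048 1334300400
```

Run as is, this prints servers.web01.mem.swap_used 2048 1334300400, which is exactly what carbon expects to receive on port 2003.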

A basic way of writing some data into graphite could be through netcat - the following script run every minute by cron captures the temp of my graphics card and sends it to graphite running on my local system:

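#!/bin/sh
# Send the current thermal zone temperature to carbon's plaintext port.
CARBON_PORT=2003
CARBON_SERVER=localhost
# Field 2 of the ACPI temperature file is the numeric reading.
temp=$(awk '{print $2}' /proc/acpi/thermal_zone/TZ00/temperature)
echo "servers.$(hostname -s).graphiccard.temp $temp $(date +%s)" | nc "$CARBON_SERVER" "$CARBON_PORT"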


Which ends up looking like this in graphite:



Of course, writing scripts like these isn't really scalable - both from a data collection perspective and in managing those endpoints, and because your graphite system may end up handling thousands of TCP connections per second.
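One partial mitigation is to send several datapoints down a single connection, since the plaintext listener reads one datapoint per line. The following is only a sketch: the function names and the servers.web01 prefix are hypothetical, and it assumes carbon is listening on localhost:2003 unless CARBON_SERVER and CARBON_PORT say otherwise.

```shell
#!/bin/sh
# Sketch: send many datapoints over one TCP connection instead of one each.
# Function names are hypothetical; carbon is assumed to be on localhost:2003.

# Read "name value" pairs on stdin and emit plaintext-protocol lines that
# share a common prefix and a single timestamp.
stamp_metrics() {
    prefix=$1
    now=${2:-$(date +%s)}
    while read -r name value; do
        printf '%s.%s %s %s\n' "$prefix" "$name" "$value" "$now"
    done
}

# Pipe the whole batch through a single nc invocation.
send_batch() {
    stamp_metrics "$1" | nc "${CARBON_SERVER:-localhost}" "${CARBON_PORT:-2003}"
}

# Usage:
#   printf 'load.1min 0.42\nmem.free 1024\n' | send_batch servers.web01
```

This still opens one connection per run, so it helps with the per-metric overhead rather than the number of hosts reporting in.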

Graphite provides a pickle interface on port 2004. Pickle simply allows you to batch lots of stats into one message, which carbon separates out on delivery, reducing your TCP overhead. You can further improve performance by using statsd, a UDP listener which forwards the data it receives on to carbon. This is very useful if you don't want the TCP connection to carbon interfering with the performance of the application writing the data - it literally fires and forgets!
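As a rough illustration of the fire-and-forget idea, here is a sketch of sending a stat to statsd from the shell. It assumes a statsd daemon on localhost:8125 (its usual default port); the function names and the STATSD_SERVER and STATSD_PORT variables are my own placeholders.

```shell
#!/bin/sh
# Sketch of a fire-and-forget update sent to statsd over UDP.
# Assumes statsd on localhost:8125; function names are hypothetical.

# statsd's wire format is "<name>:<value>|<type>", where the type is
# "c" for a counter or "ms" for a timer. Default to a counter.
statsd_line() {
    printf '%s:%s|%s\n' "$1" "$2" "${3:-c}"
}

# -u selects UDP; -w 1 stops nc lingering for more than a second.
# Nothing is read back, so the caller never waits on carbon.
statsd_send() {
    statsd_line "$@" | nc -u -w 1 "${STATSD_SERVER:-localhost}" "${STATSD_PORT:-8125}"
}

# Usage:
#   statsd_send app.logins 1 c
#   statsd_send app.render_time 320 ms
```

Because UDP has no handshake, a missing or overloaded statsd simply means the datapoint is dropped rather than your application blocking.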

Fortunately, there are plenty of tools out there which easily allow you to collect a whole raft of stats and send them to graphite - I'll mention two I use here but they are by no means the only ones out there:

Diamond is a Python daemon that collects system metrics and publishes them to graphite. It works particularly well on CentOS systems. It's not limited to OS metrics either - it can be extended to pull in all sorts of stats including memcache, mysql etc.

I build diamond as an RPM to deploy - to build the RPM simply follow these steps:

1. On a system with build tools such as rpmbuild, check out the latest codebase using git, then use python setup.py bdist_rpm to create an initial SRPM.

2. Install that SRPM and update the SPEC file with the relevant sections and add an init script such as this one:

#!/bin/sh
#
# chkconfig: - 90 60
# pidfile: /var/run/diamond.pid
# config: /etc/diamond/diamond.conf
#
### BEGIN INIT INFO
# Provides: diamond
# Required-Start: $local_fs $remote_fs $network $named
# Required-Stop: $local_fs $remote_fs $network
# Short-Description: run diamond daemon
# Description: Diamond collects system stats and passes them to graphite
### END INIT INFO
 

# Source function library.
. /etc/rc.d/init.d/functions

RETVAL=0
prog="diamond"
user="root"
exec="/usr/bin/diamond"
program="/usr/bin/diamond"
lockfile=/var/lock/subsys/diamond
config=/etc/diamond/diamond.conf

start() {
    echo -n $"Starting $prog: "
    daemon --user=$user $program
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && touch $lockfile
}

stop() {
    echo -n $"Stopping $prog: "
    killproc $program
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && rm -f $lockfile
}

# See how we were called.
case "$1" in
  start)
        start
        ;;
  stop)
        stop
        ;;
  restart)
        stop
        start
        ;;
  *)
        echo $"Usage: $prog {start|stop|restart}"
        RETVAL=2
        ;;
esac

# Report the result of the requested action to the caller.
exit $RETVAL


3. At this point, you'll be able to build a new RPM and SRPM from the updated SPEC file (for example with rpmbuild -ba).

Diamond provides quite a bit of info on configuration on its wiki. The main config file is /etc/diamond/diamond.conf, and it's pretty self-explanatory.

The other tool I frequently use is jmxtrans - this allows you to pull in stats from JMX on running Java processes such as Tomcat and forward them on to graphite. The build process to create an RPM is very similar to the one I described for diamond.

There's a lot more you can do with Graphite, including configuring relays so you can duplicate your data in two places at once, setting up carbon clusters and so on. The first hurdle of course is getting graphite up and running, and hopefully these posts will have helped towards that. It's worth reading up on whisper - the file-based database used by carbon to store stats and, of course, read by graphite. You can tune (and retune) the data storage, including retention time and how older datapoints are compacted into coarser ones. Whisper databases are similar to RRD but a bit more flexible.
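When tuning retention it helps to know roughly what a given setting costs on disk. The sketch below is back-of-the-envelope arithmetic only: the function names are hypothetical, and it assumes whisper's on-disk layout of a 16-byte header, 12 bytes of metadata per archive and 12 bytes per datapoint.

```shell
#!/bin/sh
# Rough whisper sizing arithmetic. Function names are hypothetical.

# Number of datapoints an archive holds: retention / resolution.
archive_points() {
    seconds_per_point=$1
    retention_seconds=$2
    echo $(( $retention_seconds / $seconds_per_point ))
}

# Approximate file size in bytes for a list of "resolution:retention"
# pairs, both in seconds. Assumes a 16-byte file header, 12 bytes of
# metadata per archive and 12 bytes per datapoint.
whisper_size() {
    total=16
    for archive in "$@"; do
        res=${archive%%:*}
        ret=${archive##*:}
        points=$(archive_points "$res" "$ret")
        total=$(( $total + 12 + 12 * $points ))
    done
    echo "$total"
}

# One-minute points kept for a day, then 15-minute points kept for a year.
whisper_size 60:86400 900:31536000
```

For that example the result is 437800 bytes per metric, so the cost of a retention policy scales directly with the number of metrics you send.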
