Project Status

development: inactive
latest release: 0.1.4
latest stable: none

Project Details

source code: Open Source
license: GPL
programming language: Perl

History

In October 2003 I created "IRC Web Search", which consisted of a simple parser (written in Perl) that could process Eggdrop logfiles and fill a MySQL database. The querying application was a simple web application (see figure 1) that basically just performed one single SELECT query. A cron job periodically brought the database up to date. No optimizations were implemented, so performance deteriorated drastically with every logfile added to the database, because every query searched the rows sequentially. In January 2004 I submitted this project to SourceForge.net, but soon after that the project was no longer in active development.

Figure 1. IRC Web Search

In September 2005 I started working on a simple statistics script (also written in Perl) that could generate a static HTML page (see figures 2 and 3) from the data in the aforementioned MySQL database. The statistics showed online activity per year per user and per year per channel using a calendar and some graphs. This script was nothing more than a Sunday afternoon experiment and was never released to the general public. Its name could have been "IRC Web Stats".

Figures 2 and 3. IRC Web Stats

In October 2005 I wanted to improve the performance of IRC Web Search and chose to do so by implementing a wordlist approach. I don't have any benchmark results, so the following is subjective: I would guess that at some point before the improvements were made queries took about 15 seconds to complete, and after the update results were returned almost instantly. For the record, the table containing all the separate rows from the logfiles held 345,131 records, and the computer used to run the query had an Intel Pentium II 450 MHz CPU with 256 MB of RAM. The most expensive part of the request was the COUNT() query needed for paging the results.
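
To illustrate what a wordlist approach can look like, here is a minimal sketch. It is not the original implementation, which was written in Perl against MySQL; this version uses Python with an in-memory SQLite database so it is self-contained. The table names (lines, words, word_lines), the function names, and the tokenizing rule are all assumptions made for the example. The idea is that each distinct word is stored once and linked to the log lines that contain it, so a search becomes an indexed lookup instead of a sequential scan over every row.

```python
import re
import sqlite3


def build_index(conn, log_lines):
    """Store log lines plus a word -> line mapping (the "wordlist").

    The schema below is hypothetical; it only illustrates the technique.
    """
    conn.executescript("""
        CREATE TABLE lines (id INTEGER PRIMARY KEY, text TEXT NOT NULL);
        CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT NOT NULL UNIQUE);
        CREATE TABLE word_lines (
            word_id INTEGER NOT NULL,
            line_id INTEGER NOT NULL,
            PRIMARY KEY (word_id, line_id)
        );
    """)
    for text in log_lines:
        line_id = conn.execute(
            "INSERT INTO lines (text) VALUES (?)", (text,)
        ).lastrowid
        # Assumed tokenizer: lowercase runs of letters and digits,
        # each word counted once per line.
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            conn.execute(
                "INSERT OR IGNORE INTO words (word) VALUES (?)", (word,)
            )
            word_id = conn.execute(
                "SELECT id FROM words WHERE word = ?", (word,)
            ).fetchone()[0]
            conn.execute(
                "INSERT INTO word_lines (word_id, line_id) VALUES (?, ?)",
                (word_id, line_id),
            )
    conn.commit()


def search(conn, word):
    """Return the lines containing `word`, found via the wordlist tables."""
    rows = conn.execute(
        """
        SELECT lines.text
        FROM words
        JOIN word_lines ON word_lines.word_id = words.id
        JOIN lines ON lines.id = word_lines.line_id
        WHERE words.word = ?
        ORDER BY lines.id
        """,
        (word.lower(),),
    )
    return [row[0] for row in rows]
```

The UNIQUE constraint on words.word and the primary key on word_lines give the database indexes to work with, which is where the speedup over scanning every row would come from. The trade-off is extra work and storage at import time.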

Although I created a prototype in January 2006 for what is now known as the IRC Collective, it was not until late September 2006 that I actually started programming it.