Yesterday I came back from Lucene EuroCon 2010, which took place in Prague.
There were many interesting talks over these days. Some of the slides are already on SlideShare. I can’t wait for the others to be uploaded.
I gave a talk on Thursday about how we use Solr at Trovit. I covered an overview of our architecture, several of our out-of-the-box and custom features, and some of the future lines of work we have in mind.
“Munching and Crunching: Lucene Index post-processing” was definitely my favourite talk. Andrzej Bialecki covered topics I had never even thought about. Among other things, there was a pretty complete explanation of index splitting, pruning and multi-tiered search.
People tend to think all data processing must be done at indexing time. Andrzej showed us that a lot of good stuff can be done once the index is already built.
In an hour, Yonik explained the main features coming in new Solr releases in his talk “Solr 1.5 and Beyond”. The Extended DisMax query parser, a quick introduction to SolrCloud, Spatial Search, Realtime Search and Field Collapsing were covered.
Grant Ingersoll spoke about Lucene / Solr relevance: “Practical Relevance: Tips and Tricks for Understanding and Improving Search Quality”.
It was very interesting to hear about the most commonly used techniques for relevance testing:
A/B testing, log analysis, empirical tests, asking users, or using related projects such as Open Relevance or TREC.
Mark Miller talked about SolrCloud. It promises to make life much easier for admins of distributed Solr installations.
There were really good topics in the MeetUp as well. “How We Scaled Solr to 3+ Billion Documents” by Jason Rutherglen was the one I was looking forward to the most. I always like to hear about big Solr deployments and Hadoop usage related to Lucene and Solr indexing. I think this is the biggest deployment I know of.
So, these days have been really useful. Many new ideas, lots of stuff to test.

