Hello list,

I am using the pretty old Solr 4.7 release *sigh* and am currently 
investigating performance problems.
The Solr instance currently runs very expensive queries with huge results, and 
I want to find the most promising queries to optimize.

I am currently using the Solr logfiles and a simple tool (enhanced by me) to 
analyze the queries: https://github.com/scoopex/solr-loganalyzer

Is it possible to modify the log4j appender to also log other query 
attributes, like the request/response size in bytes and the number of result 
documents?

#- File to log to and log format
log4j.appender.file.File=${solr.log}/solr.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%-5p - %d{yyyy-MM-dd HH:mm:ss.SSS}; %C; %m\n
log4j.appender.file.bufferedIO=true
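At least the number of result documents and the query time seem to be in the 
default query log lines already (hits= and QTime=), so those can be parsed 
without touching the appender; as far as I can tell, the response size in 
bytes is not logged by default. A minimal parsing sketch (the sample line is 
synthetic, but shaped like a typical 4.x query log entry):

```python
import re

# Synthetic but typical Solr 4.x query log line (core name and params are
# just examples):
line = ("INFO  - 2014-03-10 12:00:00.123; org.apache.solr.core.SolrCore; "
        "[collection1] webapp=/solr path=/select "
        "params={q=*:*&wt=javabin&version=2} hits=1042 status=0 QTime=37")

m = re.search(r"hits=(\d+) status=(\d+) QTime=(\d+)", line)
if m:
    hits, status, qtime = (int(g) for g in m.groups())
    print(hits, qtime)  # number of result documents and query time in ms
```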

Is there a better way to create detailed query stats and to replay queries on a 
test system?
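For the replay part, what I have in mind is roughly this sketch: read one 
captured query string per line, force wt=json so the responses are easy to 
inspect, and re-issue each query against the test instance (the host/port and 
handler path are assumptions for my setup, not something I have verified 
against 4.7):

```python
import time
import urllib.parse
import urllib.request

TEST_SOLR = "http://localhost:8983/solr/select"  # assumed test instance

def rewrite(qs):
    # Force wt=json so replayed responses are easy to inspect; everything
    # else in the captured query string is kept as-is.
    params = [(k, v) for k, v in urllib.parse.parse_qsl(qs) if k != "wt"]
    return urllib.parse.urlencode(params + [("wt", "json")])

def replay(query_strings):
    for qs in query_strings:
        url = TEST_SOLR + "?" + rewrite(qs)
        start = time.time()
        with urllib.request.urlopen(url) as resp:
            size = len(resp.read())
        print("%7d bytes %5d ms  %s" % (size, (time.time() - start) * 1000, qs))
```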

I am thinking about snooping on the Ethernet interface of a client or of the 
server to gather libpcap data. Is there a chance to analyze captured data 
whose format is e.g. "wt=javabin&version=2"?
I do similar things for MySQL to get non-intrusive performance analytics, 
using pt-query-digest (Percona Toolkit).

On MySQL this works as follows:

1.) Capture data
     # Capture all data on port 3306
     tcpdump -s 65535 -x -nn -q -tttt -i any port 3306 > mysql.tcp.txt
     # Capture only 1/7 of the connections (modulus of 7 on the source port)
     # if you have a very busy network connection
     tcpdump -i eth0 -s 65535 -x -n -q -tttt 'port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2' > mysql.tcp.txt

2.) Create statistics on another system using the tcpdump file
     pt-query-digest --watch-server '127.0.0.1:3307' --limit 1100000 --type tcpdump mysql.tcp.txt

If I can extract the streams of the connections - do you have an idea how to 
parse the binary data?
(Can I use parts of the Solr client?)
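The request side is plain HTTP, so pulling query strings out of reassembled 
streams looks straightforward; for the binary javabin responses, the Solr 
client ships org.apache.solr.common.util.JavaBinCodec, which can decode them, 
though I have not tried it on captured data. A sketch for the request side 
(the regex is my guess at the relevant request lines; the handler path may 
differ on other setups):

```python
import re

def extract_queries(payload):
    # Pull the query strings out of "GET /solr/.../select?... HTTP/1.x"
    # request lines found in a reassembled TCP stream (bytes).
    return [m.group(1).decode("ascii", "replace")
            for m in re.finditer(rb"GET /solr/\w*/?select\?(\S+) HTTP/1\.[01]",
                                 payload)]

stream = b"GET /solr/select?q=foo&wt=javabin&version=2 HTTP/1.1\r\nHost: x\r\n\r\n"
print(extract_queries(stream))  # ['q=foo&wt=javabin&version=2']
```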

Is there a comparable tool out there?

Regards
Marc
