Hello list,

I am using the pretty old Solr 4.7 release *sigh* and am currently investigating performance problems. The Solr instance currently runs very expensive queries with huge result sets, and I want to find the most promising queries for optimization.
I am currently using the Solr log files and a simple tool (enhanced by me) to analyze the queries: https://github.com/scoopex/solr-loganalyzer

Is it possible to modify the log4j appender to also log other query attributes, such as the response/request size in bytes and the number of matched documents?

  # File to log to and log format
  log4j.appender.file.File=${solr.log}/solr.log
  log4j.appender.file.layout=org.apache.log4j.PatternLayout
  log4j.appender.file.layout.ConversionPattern=%-5p - %d{yyyy-MM-dd HH:mm:ss.SSS}; %C; %m\n
  log4j.appender.file.bufferedIO=true

Is there a better way to create detailed query statistics and to replay queries on a test system? I am thinking about snooping on the Ethernet interface of a client or of the server to gather libpcap data. Is there a way to analyze captured data in a format such as "wt=javabin&version=2"?

I do something similar for MySQL to get non-intrusive performance analytics, using pt-query-digest (Percona Toolkit). On MySQL it works like this:

1.) Capture the data:

  # Capture all traffic on port 3306
  tcpdump -s 65535 -x -nn -q -tttt -i any port 3306 > mysql.tcp.txt

  # Capture only 1/7 of the connections (modulus of 7 on the source port)
  # if you have a very busy network connection
  tcpdump -i eth0 -s 65535 -x -n -q -tttt 'port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2' > mysql.tcp.txt

2.) Create statistics on another system using the tcpdump file:

  pt-query-digest --watch-server '127.0.0.1:3307' --limit 1100000 --type tcpdump mysql.tcp.txt

If I can extract the streams of the connections - do you have an idea how to parse the binary data? (Can I use parts of the Solr client?) Is there a comparable tool out there?

Regards
Marc
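For context, this is roughly the kind of per-query digest I extract from the request log today. It is only a minimal sketch: it assumes the stock Solr 4.x request-log line shape with `path=...`, `params={...}`, `hits=` and `QTime=` fields, which may not match logs produced by a customized ConversionPattern, and the `digest` function and its output layout are my own illustration, not part of any existing tool:

```python
import re
from collections import defaultdict

# Assumed stock Solr 4.x request-log line (field names can differ if the
# log4j ConversionPattern has been customized):
#   ... webapp=/solr path=/select params={q=foo&rows=10} hits=1234 status=0 QTime=250
LINE_RE = re.compile(
    r'path=(?P<path>\S+)\s+'
    r'params=\{(?P<params>[^}]*)\}\s+'
    r'hits=(?P<hits>\d+)\s+status=\d+\s+QTime=(?P<qtime>\d+)'
)

def digest(lines):
    """Aggregate request count, total QTime (ms) and total hits per handler path."""
    stats = defaultdict(lambda: {"count": 0, "qtime": 0, "hits": 0})
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not a search-handler log line (e.g. updates have no hits=)
        s = stats[m.group("path")]
        s["count"] += 1
        s["qtime"] += int(m.group("qtime"))
        s["hits"] += int(m.group("hits"))
    return dict(stats)

if __name__ == "__main__":
    sample = [
        "INFO  - 2014-03-01 12:00:00.000; org.apache.solr.core.SolrCore; "
        "[collection1] webapp=/solr path=/select "
        "params={q=*:*&wt=javabin&version=2} hits=1000 status=0 QTime=250",
    ]
    print(digest(sample))
```

A pcap-based approach would give me the response sizes that this log-scraping approach cannot see, which is why I am asking about the tcpdump route below.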