Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-07 Thread Gregg Donovan
That was my first attempt, but it's much trickier than I anticipated.

A filter that calls HttpServletRequest#getParameter() before
SolrDispatchFilter will trigger an exception  -- see
getParameterIncompatibilityException [1] -- if the request is a POST. It
seems that Solr depends on the configured per-core SolrRequestParser to
properly parse the request parameters. A servlet filter that came before
SolrDispatchFilter would need to fetch the correct SolrRequestParser for
the requested core, parse the request, and reset the InputStream before
pulling the data into the MDC. It also duplicates the work of request
parsing. It's especially tricky if you want to remove the tracing
parameters from the SolrParams and just have them in the MDC to avoid them
being logged twice.


[1]
https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/servlet/SolrRequestParsers.java#L621:L628
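
To illustrate why an earlier filter would have to restore the body: a request body can be read only once, so a filter that parses a POST itself must buffer the bytes and hand a fresh stream downstream. The sketch below shows only that buffering step in plain Java. BodyReplaySketch, drain, and replay are hypothetical names, not Solr code, and a real filter would also need an HttpServletRequestWrapper that returns the replayed stream from getInputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper showing the "buffer, then replay" step for a one-shot request body. */
final class BodyReplaySketch {
    private BodyReplaySketch() {}

    /** Reads the stream to the end; afterwards the stream itself has nothing left to give. */
    static byte[] drain(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a fresh stream over the buffered bytes for the next consumer in the chain. */
    static InputStream replay(byte[] bufferedBody) {
        return new ByteArrayInputStream(bufferedBody);
    }
}
```

The extra copy of the body made here is the duplicated parsing work mentioned above.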


On Sun, Apr 6, 2014 at 2:20 PM, Alexandre Rafalovitch arafa...@gmail.com wrote:

 On second thought,

 If you are already managing to pass the value using the request
 parameters, what stops you from just having a servlet filter looking
 for that parameter and assigning it directly to the MDC context?

 Regards,
Alex.
 Personal website: http://www.outerthoughts.com/
 Current project: http://www.solr-start.com/ - Accelerating your Solr
 proficiency





Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-07 Thread Alexandre Rafalovitch
So to rephrase:

Solr will barf at unknown parameters, so we cannot currently send them in
band.

And the out-of-band approach does not work due to POST body handling complexity.

You are effectively proposing a dynamic set with a common prefix to stop the
complaints. Plus the code to propagate those params.

Is that a good general description? I am just wondering if this can be
matched to some other real issues as well.

Regards,
 Alex
On 07/04/2014 11:23 pm, Gregg Donovan gregg...@gmail.com wrote:

 That was my first attempt, but it's much trickier than I anticipated.

 A filter that calls HttpServletRequest#getParameter() before
 SolrDispatchFilter will trigger an exception  -- see
 getParameterIncompatibilityException [1] -- if the request is a POST. It
 seems that Solr depends on the configured per-core SolrRequestParser to
 properly parse the request parameters. A servlet filter that came before
 SolrDispatchFilter would need to fetch the correct SolrRequestParser for
 the requested core, parse the request, and reset the InputStream before
 pulling the data into the MDC. It also duplicates the work of request
 parsing. It's especially tricky if you want to remove the tracing
 parameters from the SolrParams and just have them in the MDC to avoid them
 being logged twice.


 [1]

 https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/servlet/SolrRequestParsers.java#L621:L628





Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-07 Thread Michael Sokolov
I had to grapple with something like this problem when I wrote Lux's 
app-server.  I extended SolrDispatchFilter and handle parameter 
swizzling to keep everything nicey-nicey for Solr while being able to 
play games with parameters of my own.  Perhaps this will give you some 
ideas:


https://github.com/msokolov/lux/blob/master/src/main/java/lux/solr/LuxDispatchFilter.java

It's definitely hackish, but it seems to get the job done for me. It's
not a reusable component, but it might serve as an illustration of one way
to handle the problem.


-Mike

On 04/07/2014 12:23 PM, Gregg Donovan wrote:

That was my first attempt, but it's much trickier than I anticipated.

A filter that calls HttpServletRequest#getParameter() before
SolrDispatchFilter will trigger an exception  -- see
getParameterIncompatibilityException [1] -- if the request is a POST. It
seems that Solr depends on the configured per-core SolrRequestParser to
properly parse the request parameters. A servlet filter that came before
SolrDispatchFilter would need to fetch the correct SolrRequestParser for
the requested core, parse the request, and reset the InputStream before
pulling the data into the MDC. It also duplicates the work of request
parsing. It's especially tricky if you want to remove the tracing
parameters from the SolrParams and just have them in the MDC to avoid them
being logged twice.


[1]
https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/servlet/SolrRequestParsers.java#L621:L628







Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-07 Thread Gregg Donovan
Michael,

Thanks! Unfortunately, as we use POSTs, that approach would trigger the
getParameterIncompatibilityException call due to the Enumeration of
getParameterNames before SolrDispatchFilter has a chance to access the
InputStream.

I opened https://issues.apache.org/jira/browse/SOLR-5969 to discuss further
and attached our current patch.


On Mon, Apr 7, 2014 at 2:02 PM, Michael Sokolov 
msoko...@safaribooksonline.com wrote:

 I had to grapple with something like this problem when I wrote Lux's
 app-server.  I extended SolrDispatchFilter and handle parameter swizzling
 to keep everything nicey-nicey for Solr while being able to play games with
 parameters of my own.  Perhaps this will give you some ideas:

 https://github.com/msokolov/lux/blob/master/src/main/java/lux/solr/LuxDispatchFilter.java

 It's definitely hackish, but it seems to get the job done for me. It's not
 a reusable component, but it might serve as an illustration of one way to
 handle the problem.

 -Mike







Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-07 Thread Michael Sokolov
Yes, I see. SolrDispatchFilter is not really written with
extensibility in mind.


-Mike

On 4/7/14 3:50 PM, Gregg Donovan wrote:

Michael,

Thanks! Unfortunately, as we use POSTs, that approach would trigger the
getParameterIncompatibilityException call due to the Enumeration of
getParameterNames before SolrDispatchFilter has a chance to access the
InputStream.

I opened https://issues.apache.org/jira/browse/SOLR-5969 to discuss further
and attached our current patch.








Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-07 Thread Steve Davids
I have had this exact same use case. We ended up just setting a header
value; a servlet filter then reads that header and sets the MDC property.
Because the filter only reads a header, Solr didn’t complain about the
request being read before it reached SolrDispatchFilter. We used the Jetty
web defaults to put this filter at the beginning of the servlet processing
chain without having to crack open the war.

-Steve
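
A rough sketch of the pattern described above, written in plain Java so it runs without the servlet API or SLF4J. No code was posted for this approach, so everything here is an assumption: the class name HeaderToMdcSketch, the "requestId" key, the ThreadLocal map standing in for org.slf4j.MDC, and the Runnable standing in for chain.doFilter(request, response).

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of the set-then-clear pattern a header-reading filter would follow. */
final class HeaderToMdcSketch {
    // Stand-in for org.slf4j.MDC, which is likewise scoped to the current thread.
    static final ThreadLocal<Map<String, String>> MDC_STAND_IN =
            ThreadLocal.withInitial(HashMap::new);

    // Hypothetical MDC key; the thread does not name one.
    static final String KEY = "requestId";

    private HeaderToMdcSketch() {}

    /**
     * @param headerValue value of the tracing header, or null if the client sent none
     * @param restOfChain stand-in for the rest of the filter chain, including Solr
     */
    static void doFilter(String headerValue, Runnable restOfChain) {
        if (headerValue != null) {
            MDC_STAND_IN.get().put(KEY, headerValue);
        }
        try {
            restOfChain.run();
        } finally {
            // Container threads are pooled, so remove the value even when the
            // chain throws; otherwise it could leak into the next request.
            MDC_STAND_IN.get().remove(KEY);
        }
    }
}
```

In a real filter, headerValue would come from HttpServletRequest#getHeader, which does not touch the request body, and the map calls would become MDC.put and MDC.remove.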

On Apr 7, 2014, at 8:01 PM, Michael Sokolov msoko...@safaribooksonline.com 
wrote:

 Yes, I see. SolrDispatchFilter is not really written with extensibility
 in mind.
 
 -Mike
 



Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-06 Thread Alexandre Rafalovitch
On second thought,

If you are already managing to pass the value using the request
parameters, what stops you from just having a servlet filter looking
for that parameter and assigning it directly to the MDC context?

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Sat, Apr 5, 2014 at 7:45 AM, Alexandre Rafalovitch
arafa...@gmail.com wrote:
 I like the idea. No comments about implementation, leave it to others.

 But if it is done, maybe somebody very familiar with logging can also
 review Solr's current logging config. I suspect it is not optimized
 for troubleshooting at this point.

 Regards,
Alex.
 Personal website: http://www.outerthoughts.com/
 Current project: http://www.solr-start.com/ - Accelerating your Solr 
 proficiency





Distributed tracing for Solr via adding HTTP headers?

2014-04-04 Thread Gregg Donovan
We have some metadata -- e.g. a request UUID -- that we log to every log
line using Log4J's MDC [1]. The UUID logging allows us to connect any log
lines we have for a given request across servers. Sort of like Zipkin [2].

Currently we're using EmbeddedSolrServer without sharding, so adding the
UUID is fairly simple, since everything is in one process and one thread.
But, we're testing a sharded HTTP implementation and running into some
difficulties getting this data passed around in a way that lets us trace
all log lines generated by a request to its UUID.

The first thing I tried was adding the UUID to the SolrParams.
This gets those values logged on the shards if a request is successful,
but the values are not in the MDC for any log lines written before the
final log line, e.g. an exception thrown in a custom component.

My current thought is that sending HTTP headers with diagnostic information
would be very useful. Those could be placed in the MDC even before handing
off work to SolrDispatchFilter, so that any Solr problem will have the
proper logging.

I.e. every additional header added to a Solr request gets a Solr- prefix.
On the server, we look for those headers and add them to the SLF4J MDC [3].

Here's a patch [4] that does this that we're testing out. Is this a good
idea? Would anyone else find this useful? If so, I'll open a ticket.

--Gregg

[1] http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/MDC.html
[2] http://twitter.github.io/zipkin/
[3] http://www.slf4j.org/api/org/slf4j/MDC.html
[4] https://gist.github.com/greggdonovan/9982327
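
A minimal sketch of the server-side half of this idea, in plain Java so it runs without the servlet API or SLF4J on the classpath. It is not taken from the patch in [4]: the class name TracingHeaders, the method toMdcEntries, and the use of a plain Map in place of the MDC are assumptions for illustration. It keeps the full header name as the key; stripping the prefix would be an equally plausible choice.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: selects "Solr-"-prefixed headers for the logging context. */
final class TracingHeaders {
    // Prefix from the proposal above; HTTP header names are case-insensitive.
    static final String PREFIX = "Solr-";

    private TracingHeaders() {}

    /** Returns the header entries that a filter would copy into the MDC. */
    static Map<String, String> toMdcEntries(Map<String, String> requestHeaders) {
        Map<String, String> entries = new LinkedHashMap<>();
        for (Map.Entry<String, String> header : requestHeaders.entrySet()) {
            String name = header.getKey();
            boolean hasPrefix = name.regionMatches(true, 0, PREFIX, 0, PREFIX.length());
            // Skip a bare "Solr-" header, which carries no name after the prefix.
            if (hasPrefix && name.length() > PREFIX.length()) {
                entries.put(name, header.getValue());
            }
        }
        return entries;
    }
}
```

In a real servlet filter, each returned entry would be passed to org.slf4j.MDC.put(key, value) before calling the rest of the chain.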


Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-04 Thread Alexandre Rafalovitch
I like the idea. No comments about implementation, leave it to others.

But if it is done, maybe somebody very familiar with logging can also
review Solr's current logging config. I suspect it is not optimized
for troubleshooting at this point.

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Sat, Apr 5, 2014 at 3:16 AM, Gregg Donovan gregg...@gmail.com wrote:
 We have some metadata -- e.g. a request UUID -- that we log to every log
 line using Log4J's MDC [1]. The UUID logging allows us to connect any log
 lines we have for a given request across servers. Sort of like Zipkin [2].

 Currently we're using EmbeddedSolrServer without sharding, so adding the
 UUID is fairly simple, since everything is in one process and one thread.
 But, we're testing a sharded HTTP implementation and running into some
 difficulties getting this data passed around in a way that lets us trace
 all log lines generated by a request to its UUID.