Re: Difference between currency fieldType and float fieldType

2016-12-06 Thread Dorian Hoxha
Yeah, you'll have to do the conversion yourself (or use something internal,
like the CurrencyField).

Think of it like datetimes: you store everything in UTC (cents), but
display it to each user in their own timezone (their own currency, or full
dollars instead of cents).
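A minimal sketch of that cents round-trip (helper names are illustrative, not from the thread): parse the display amount exactly with Decimal so no precision is lost, store an integer, and format back out.

```python
from decimal import Decimal, ROUND_HALF_UP

def to_cents(amount_str):
    """Parse a display amount like '1234.56' into an integer number of
    cents -- the canonical form stored in the index."""
    quantized = Decimal(amount_str).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return int(quantized * 100)

def to_display(cents):
    """Convert the stored integer back to a display string."""
    return f"${cents // 100}.{cents % 100:02d}"

print(to_cents("1234.56"))   # 123456
print(to_display(123456))    # $1234.56
```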

On Wed, Dec 7, 2016 at 8:23 AM, Zheng Lin Edwin Yeo 
wrote:

> But if I index $1234.56 as "123456", won't it affect search or faceting if
> I query Solr directly?
>
> Say I search for documents with an amount less than $2000: it will
> not match unless, when we do the search, we pass "200000" to Solr?
>
> Regards,
> Edwin
>
>
> On 7 December 2016 at 07:44, Chris Hostetter 
> wrote:
>
> > : Thanks for your reply.
> > :
> > : That means the best fieldType to use for money is currencyField, and
> not
> > : any other fieldType?
> >
> > The primary use case for CurrencyField is when you want to do dynamic
> > currency fluctuations between multiple currency types at query time --
> but
> > to do that you either need to use the FileExchangeRateProvider and have
> > your own backend system to update the exchange rates, or you have to
> have
> > an openexchangerates.org account, or implement some other provider (with
> > custom solr java code)
> >
> >
> > If you only care about a single type of currency -- for example, if all
> > you care about is US Dollars -- then just use either
> > TrieIntField or TrieLongField and represent in the smallest possible
> > increment you need to measure -- for US Dollars this would be cents. ie:
> > $1234.56 would be put in your index as "123456"
> >
> >
> >
> > -Hoss
> > http://www.lucidworks.com/
> >
>


Re: Difference between currency fieldType and float fieldType

2016-12-06 Thread Zheng Lin Edwin Yeo
But if I index $1234.56 as "123456", won't it affect search or faceting if
I query Solr directly?

Say I search for documents with an amount less than $2000: it will
not match unless, when we do the search, we pass "200000" to Solr?

Regards,
Edwin


On 7 December 2016 at 07:44, Chris Hostetter 
wrote:

> : Thanks for your reply.
> :
> : That means the best fieldType to use for money is currencyField, and not
> : any other fieldType?
>
> The primary use case for CurrencyField is when you want to do dynamic
> currency fluctuations between multiple currency types at query time -- but
> to do that you either need to use the FileExchangeRateProvider and have
> your own backend system to update the exchange rates, or you have to have
> an openexchangerates.org account, or implement some other provider (with
> custom solr java code)
>
>
> If you only care about a single type of currency -- for example, if all
> you care about is US Dollars -- then just use either
> TrieIntField or TrieLongField and represent in the smallest possible
> increment you need to measure -- for US Dollars this would be cents. ie:
> $1234.56 would be put in your index as "123456"
>
>
>
> -Hoss
> http://www.lucidworks.com/
>


Re: Solr node not found in ZK live_nodes

2016-12-06 Thread Manohar Sripada
Thanks Erick! Should I create a JIRA issue for the same?

Regarding the logs, I have changed the log level to WARN. That may be the
reason I couldn't get anything from them.

Thanks,
Manohar

On Tue, Dec 6, 2016 at 9:58 PM, Erick Erickson 
wrote:

> Most likely reason is that the Solr node in question
> was not reachable, thus it was removed from
> live_nodes. Perhaps due to a temporary network
> glitch, a long GC pause or the like. If you're rolling
> your logs over, it's quite possible that any illuminating
> messages were lost. The default 4M size for each
> log is quite low at INFO level...
>
> It does seem possible for a Solr node to periodically
> check its status and re-insert itself into live_nodes,
> go through recovery and all that. So far most of that
> registration logic is baked into startup code. What
> do others think? Worth a JIRA?
>
> Erick
>
> On Tue, Dec 6, 2016 at 3:53 AM, Manohar Sripada 
> wrote:
> > We have a 16 node cluster of Solr (5.2.1) and 5 node Zookeeper (3.4.6).
> >
> > All the Solr nodes were registered to Zookeeper (ls /live_nodes) when
> setup
> > was done 3 months back. Suddenly, few days back our search started
> failing
> > because one of the solr node(consider s16) was not seen in Zookeeper,
> i.e.,
> > when we checked for *"ls /live_nodes"*, *s16 *solr node was not found.
> > However, the corresponding Solr process was up and running.
> >
> > To my surprise, I couldn't find any errors or warnings in the Solr or
> > Zookeeper logs related to this. I have a few questions -
> >
> > 1. Is there any reason why this registration to ZK was lost? I know the
> > logs should provide some information, but they didn't. Has anyone
> > encountered a similar issue, and if so, what can be the root cause?
> > 2. Shouldn't Solr be clever enough to detect that the registration to ZK
> > was lost (for some reason) and try to re-register?
> >
> > PS: The issue is resolved by restarting the Solr node. However, I am
> > curious to know why it happened in the first place.
> >
> > Thanks
>


Re: solr audit logging

2016-12-06 Thread Jeff Courtade
They wanted an out-of-the-box solution.

This is what I found too: that it would be custom. I was hoping I just was
not finding something obvious.


Jeff Courtade
M: 240.507.6116

On Dec 6, 2016 7:07 PM, "John Bickerstaff"  wrote:

> You know - if I had to build this, I would consider slurping up the
> relevant log entries (if they exist) and feeding them to Kafka - then your
> people who want to analyze what happened can get those entries again and
> again (Think of Kafka kind of like a persistent messaging store that can
> store log entries or anything you want...)
>
> Of course, how much work you'd have to put into that depends on the
> technical skill of whoever is going to consume this stuff...
>
> Also, a plain old relational database can easily hold these things as well
> -and the code to parse the log messages into some simple tables wouldn't be
> that difficult...  There are probably existing examples / projects...
>
> Naturally - if the standard log entries do NOT get you what you need, then
> it gets to be more of an effort, although adding an extension to Solr isn't
> too hard once you understand the process...
>
> Ping back and let us know what you find in the logs and if you want more
> "advice" -- which you should always take with a grain of salt...
>
> On Tue, Dec 6, 2016 at 3:56 PM, John Bickerstaff  >
> wrote:
>
> > If you can identify currently-logged messages that give you what you need
> > (even if you have to modify or process them afterwards) you can easily
> make
> > a custom log4j config that grabs ONLY what you want and dumps it into a
> > separate file...
> >
> > I'm pretty sure I've seen all the requests coming through in my Solr log
> > files...
> >
> > In case that helps...
> >
> > On Tue, Dec 6, 2016 at 2:08 PM, Alexandre Rafalovitch <
> arafa...@gmail.com>
> > wrote:
> >
> >> There is also Jetty level access log which shows the requests, though
> >> it may not show the HTTP PUT bodies.
> >>
> >> Finally, various online monitoring services probably have agents that
> >> integrate with Solr to show what's happening. Usually costs money
> >> though.
> >>
> >> Regards,
> >> Alex.
> >> 
> >> http://www.solr-start.com/ - Resources for Solr users, new and
> >> experienced
> >>
> >>
> >> On 6 December 2016 at 14:34, Jeff Courtade 
> >> wrote:
> >> > Thanks very much, the trace idea is a brilliant way to dig into it. It
> >> > did not occur to me.
> >> >
> >> > I had another coworker suggest the custom
> >> >
> >> > http://lucene.apache.org/solr/6_3_0/solr-core/org/apache/sol
> >> r/update/processor/LogUpdateProcessorFactory.html
> >> >
> >> >
> >> > this is beyond my limited abilities.
> >> >
> >> >
> >> > I will see what we can dig up out of the logs...
> >> >
> >> >
> >> > the original request was this...
> >> >
> >> >
> >> > "Is there any configuration, plugin, or application that will create
> an
> >> > audit trail for Solr requests? We have teams that would like to be
> able
> >> to
> >> > pull back changes/requests to documents in solr given a time period.
> The
> >> > information they would like to retrieve is the request to solr, where
> it
> >> > came from, and what the request did."
> >> >
> >> >
> >> > I am starting to think there is not a simple solution to this. I was
> >> hoping
> >> > there was an UpdateAudit class or something I could flip a switch on
> or
> >> > some such...
> >> >
> >> >
> >> >
> >> > On Tue, Dec 6, 2016 at 2:20 PM, Alexandre Rafalovitch <
> >> arafa...@gmail.com>
> >> > wrote:
> >> >
> >> >> You could turn the trace mode for everything in the Admin UI (under
> >> >> logs/levels) and see if any of the existing information is sufficient
> >> >> for your needs. If yes, then you change log level in the
> configuration
> >> >> just for that class/element.
> >> >>
> >> >> Alternatively, you could do a custom UpdateRequestProcessor in the
> >> >> request handler(s) that deal with update. Or perhaps
> >> >> LogUpdateProcessor (that's in every standard chain) is sufficient:
> >> >> http://www.solr-start.com/javadoc/solr-lucene/org/
> >> >> apache/solr/update/processor/LogUpdateProcessorFactory.html
> >> >>
> >> >> But it is also possible that the audit.log is something that has a
> >> >> specific format that other tools use. So, you could start from asking
> >> >> how that file would be used and then working backwards into Solr.
> >> >> Which would most likely be a custom URP, as I mentioned earlier.
> >> >>
> >> >> Regards,
> >> >>Alex.
> >> >> P.s. Remember that there are full document updates and partial
> >> >> updates. What you want to log about that is your business level
> >> >> decision.
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Thanks,
> >> >
> >> > Jeff Courtade
> >> > M: 240.507.6116
> >>
> >
> >
>


Re: solr audit logging

2016-12-06 Thread John Bickerstaff
You know - if I had to build this, I would consider slurping up the
relevant log entries (if they exist) and feeding them to Kafka - then your
people who want to analyze what happened can get those entries again and
again (Think of Kafka kind of like a persistent messaging store that can
store log entries or anything you want...)

Of course, how much work you'd have to put into that depends on the
technical skill of whoever is going to consume this stuff...

Also, a plain old relational database can easily hold these things as well
-and the code to parse the log messages into some simple tables wouldn't be
that difficult...  There are probably existing examples / projects...
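A sketch of such a log parser, modeled on the LogUpdateProcessor entries that appear later in this digest (the regex group names and the example line are illustrative):

```python
import re

# Example LogUpdateProcessor entry (same shape as the ones in this digest)
LINE = ('INFO  - 2016-12-06 18:43:23.929; '
        'org.apache.solr.update.processor.LogUpdateProcessor; [coreENCA] '
        'webapp=/solr path=/update params={commit=false} '
        '{add=[Global_44235 (1552993270510911488)]} 0 8292')

# Pull out the fields you would load into audit tables
PATTERN = re.compile(
    r'(?P<level>\w+)\s+- (?P<ts>[\d:. -]+); '
    r'\S*LogUpdateProcessor; \[(?P<core>\w+)\] '
    r'webapp=(?P<webapp>\S+) path=(?P<path>\S+) params=\{(?P<params>[^}]*)\}'
)

m = PATTERN.search(LINE)
print(m.group('core'), m.group('path'), m.group('params'))
# coreENCA /update commit=false
```

From here each match maps naturally onto a row in a simple (timestamp, core, path, params) table.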

Naturally - if the standard log entries do NOT get you what you need, then
it gets to be more of an effort, although adding an extension to Solr isn't
too hard once you understand the process...

Ping back and let us know what you find in the logs and if you want more
"advice" -- which you should always take with a grain of salt...

On Tue, Dec 6, 2016 at 3:56 PM, John Bickerstaff 
wrote:

> If you can identify currently-logged messages that give you what you need
> (even if you have to modify or process them afterwards) you can easily make
> a custom log4j config that grabs ONLY what you want and dumps it into a
> separate file...
>
> I'm pretty sure I've seen all the requests coming through in my Solr log
> files...
>
> In case that helps...
>
> On Tue, Dec 6, 2016 at 2:08 PM, Alexandre Rafalovitch 
> wrote:
>
>> There is also Jetty level access log which shows the requests, though
>> it may not show the HTTP PUT bodies.
>>
>> Finally, various online monitoring services probably have agents that
>> integrate with Solr to show what's happening. Usually costs money
>> though.
>>
>> Regards,
>> Alex.
>> 
>> http://www.solr-start.com/ - Resources for Solr users, new and
>> experienced
>>
>>
>> On 6 December 2016 at 14:34, Jeff Courtade 
>> wrote:
>> > Thanks very much, the trace idea is a brilliant way to dig into it. It
>> > did not occur to me.
>> >
>> > I had another coworker suggest the custom
>> >
>> > http://lucene.apache.org/solr/6_3_0/solr-core/org/apache/sol
>> r/update/processor/LogUpdateProcessorFactory.html
>> >
>> >
>> > this is beyond my limited abilities.
>> >
>> >
>> > I will see what we can dig up out of the logs...
>> >
>> >
>> > the original request was this...
>> >
>> >
>> > "Is there any configuration, plugin, or application that will create an
>> > audit trail for Solr requests? We have teams that would like to be able
>> to
>> > pull back changes/requests to documents in solr given a time period. The
>> > information they would like to retrieve is the request to solr, where it
>> > came from, and what the request did."
>> >
>> >
>> > I am starting to think there is not a simple solution to this. I was
>> hoping
>> > there was an UpdateAudit class or something I could flip a switch on or
>> > some such...
>> >
>> >
>> >
>> > On Tue, Dec 6, 2016 at 2:20 PM, Alexandre Rafalovitch <
>> arafa...@gmail.com>
>> > wrote:
>> >
>> >> You could turn the trace mode for everything in the Admin UI (under
>> >> logs/levels) and see if any of the existing information is sufficient
>> >> for your needs. If yes, then you change log level in the configuration
>> >> just for that class/element.
>> >>
>> >> Alternatively, you could do a custom UpdateRequestProcessor in the
>> >> request handler(s) that deal with update. Or perhaps
>> >> LogUpdateProcessor (that's in every standard chain) is sufficient:
>> >> http://www.solr-start.com/javadoc/solr-lucene/org/
>> >> apache/solr/update/processor/LogUpdateProcessorFactory.html
>> >>
>> >> But it is also possible that the audit.log is something that has a
>> >> specific format that other tools use. So, you could start from asking
>> >> how that file would be used and then working backwards into Solr.
>> >> Which would most likely be a custom URP, as I mentioned earlier.
>> >>
>> >> Regards,
>> >>Alex.
>> >> P.s. Remember that there are full document updates and partial
>> >> updates. What you want to log about that is your business level
>> >> decision.
>> >>
>> >
>> >
>> >
>> > --
>> > Thanks,
>> >
>> > Jeff Courtade
>> > M: 240.507.6116
>>
>
>


Re: Difference between currency fieldType and float fieldType

2016-12-06 Thread Chris Hostetter
: Thanks for your reply.
: 
: That means the best fieldType to use for money is currencyField, and not
: any other fieldType?

The primary use case for CurrencyField is when you want to do dynamic 
currency fluctuations between multiple currency types at query time -- but 
to do that you either need to use the FileExchangeRateProvider and have 
your own backend system to update the exchange rates, or you have to have
an openexchangerates.org account, or implement some other provider (with 
custom solr java code)


If you only care about a single type of currency -- for example, if all 
you care about is US Dollars -- then just use either
TrieIntField or TrieLongField and represent in the smallest possible 
increment you need to measure -- for US Dollars this would be cents. ie: 
$1234.56 would be put in your index as "123456"
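For the "$2000" question earlier in the thread, a hedged sketch of building the equivalent range query against the cents field (the field name `amount_cents` is an assumption; mixed inclusive/exclusive brackets work in Solr 4 and later):

```python
def cents_range_query(field, max_dollars):
    """Build a Solr range query for 'amount less than max_dollars'
    against an integer field that stores cents."""
    max_cents = int(round(max_dollars * 100))
    # '[' is inclusive, '}' makes the upper bound exclusive
    return f"{field}:[* TO {max_cents}}}"

print(cents_range_query("amount_cents", 2000))  # amount_cents:[* TO 200000}
```

The client does the dollars-to-cents conversion once, at query-build time; faceting and sorting then operate on plain integers.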



-Hoss
http://www.lucidworks.com/


Re: Difference between currency fieldType and float fieldType

2016-12-06 Thread Zheng Lin Edwin Yeo
Thanks for your reply.

That means the best fieldType to use for money is currencyField, and not
any other fieldType?

Regards,
Edwin

On 6 December 2016 at 21:33, Dorian Hoxha  wrote:

> Don't use float for money (in whatever db).
> https://wiki.apache.org/solr/CurrencyField
> What you do is save the money as cents, and store that in a long. That's
> what the currencyField probably does for you inside.
> It provides currency conversion at query-time.
>
>
> On Tue, Dec 6, 2016 at 4:45 AM, Zheng Lin Edwin Yeo 
> wrote:
>
> > Hi,
> >
> > Would like to understand better between the currency fieldType and float
> > fieldType.
> >
> > If I were to index a field that is a currency field by nature (Eg:
> amount)
> > into Solr, is it better to use the currency fieldType as compared to the
> > float fieldType?
> >
> > I found that for the float fieldType, if the amount is very big, the last
> > decimal place may get cut off in the index. For example, if the amount in
> > the original document is 800212.64, the number that is indexed in Solr is
> > 800212.6.
> >
> > Although using the currency fieldType will solve this issue, I found
> > that I am not able to do faceting on the currency fieldType. I will
> > need to have the facet so that I can list out the various amounts that
> > are available based on the search criteria.
> >
> > As such, I would like to seek your recommendation to determine which
> > fieldType is best for my needs.
> >
> > I'm using Solr 6.2.1
> >
> > Regards,
> > Edwin
> >
>
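Edwin's "800212.64 becomes 800212.6" observation above is single-precision float rounding, not a Solr bug. A quick illustration (editor's sketch, not from the thread): round-tripping the value through a 32-bit IEEE 754 float shows the last digit is already unrepresentable.

```python
import struct

def as_float32(x):
    """Round-trip a Python float through a 32-bit IEEE 754 float,
    mimicking what a single-precision (float) field stores."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

stored = as_float32(800212.64)
print(stored)  # 800212.625 -- the .64 is already lost at index time
```

This is exactly why the thread's advice is to store cents in an integer or long field instead.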


Re: solr audit logging

2016-12-06 Thread John Bickerstaff
If you can identify currently-logged messages that give you what you need
(even if you have to modify or process them afterwards) you can easily make
a custom log4j config that grabs ONLY what you want and dumps it into a
separate file...

I'm pretty sure I've seen all the requests coming through in my Solr log
files...

In case that helps...
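A minimal sketch of the log4j setup described above, for the log4j 1.2 that ships with Solr 4/5 (the appender name, file path, and logger choice are assumptions): route only LogUpdateProcessor messages to a dedicated audit file, in addition to the normal log.

```properties
# Illustrative log4j.properties fragment -- capture update-log entries
# into a separate rolling audit file
log4j.appender.AUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.AUDIT.File=logs/solr_audit.log
log4j.appender.AUDIT.MaxFileSize=100MB
log4j.appender.AUDIT.MaxBackupIndex=9
log4j.appender.AUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.AUDIT.layout.ConversionPattern=%d{ISO8601} %m%n
# Send this one logger's INFO output to the audit appender
log4j.logger.org.apache.solr.update.processor.LogUpdateProcessor=INFO, AUDIT
```

The same pattern works for any logger whose existing messages already carry what you need.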

On Tue, Dec 6, 2016 at 2:08 PM, Alexandre Rafalovitch 
wrote:

> There is also Jetty level access log which shows the requests, though
> it may not show the HTTP PUT bodies.
>
> Finally, various online monitoring services probably have agents that
> integrate with Solr to show what's happening. Usually costs money
> though.
>
> Regards,
> Alex.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 6 December 2016 at 14:34, Jeff Courtade  wrote:
> > Thanks very much, the trace idea is a brilliant way to dig into it. It
> > did not occur to me.
> >
> > I had another coworker suggest the custom
> >
> > http://lucene.apache.org/solr/6_3_0/solr-core/org/apache/
> solr/update/processor/LogUpdateProcessorFactory.html
> >
> >
> > this is beyond my limited abilities.
> >
> >
> > I will see what we can dig up out of the logs...
> >
> >
> > the original request was this...
> >
> >
> > "Is there any configuration, plugin, or application that will create an
> > audit trail for Solr requests? We have teams that would like to be able
> to
> > pull back changes/requests to documents in solr given a time period. The
> > information they would like to retrieve is the request to solr, where it
> > came from, and what the request did."
> >
> >
> > I am starting to think there is not a simple solution to this. I was
> hoping
> > there was an UpdateAudit class or something I could flip a switch on or
> > some such...
> >
> >
> >
> > On Tue, Dec 6, 2016 at 2:20 PM, Alexandre Rafalovitch <
> arafa...@gmail.com>
> > wrote:
> >
> >> You could turn the trace mode for everything in the Admin UI (under
> >> logs/levels) and see if any of the existing information is sufficient
> >> for your needs. If yes, then you change log level in the configuration
> >> just for that class/element.
> >>
> >> Alternatively, you could do a custom UpdateRequestProcessor in the
> >> request handler(s) that deal with update. Or perhaps
> >> LogUpdateProcessor (that's in every standard chain) is sufficient:
> >> http://www.solr-start.com/javadoc/solr-lucene/org/
> >> apache/solr/update/processor/LogUpdateProcessorFactory.html
> >>
> >> But it is also possible that the audit.log is something that has a
> >> specific format that other tools use. So, you could start from asking
> >> how that file would be used and then working backwards into Solr.
> >> Which would most likely be a custom URP, as I mentioned earlier.
> >>
> >> Regards,
> >>Alex.
> >> P.s. Remember that there are full document updates and partial
> >> updates. What you want to log about that is your business level
> >> decision.
> >>
> >
> >
> >
> > --
> > Thanks,
> >
> > Jeff Courtade
> > M: 240.507.6116
>


Re: IndexWriter exception

2016-12-06 Thread Erick Erickson
bq: maxWarmingSearchers is set to 6

Red flag here. If this was done to avoid the warning in the logs about
too many warming searchers, it's a clear indication that you're
committing far too often. Let's see exactly what you're using to post
when you say you're "using the REST API". My bet: each request does a
commit. If this is the post.jar tool, there's an option to _not_ commit,
and I'd be sure to set that and let your autocommit settings handle
committing.
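What Erick describes can be sketched as a solrconfig.xml fragment (the values here are illustrative, not from the thread): commits come from the autocommit settings inside <updateHandler>, not from each client request, and maxWarmingSearchers goes back to its default of 2.

```xml
<!-- Illustrative autocommit settings: clients send updates with no
     commit; Solr commits on its own schedule -->
<autoCommit>
  <maxTime>60000</maxTime>            <!-- hard commit: flush to disk -->
  <openSearcher>false</openSearcher>  <!-- no new searcher on hard commit -->
</autoCommit>
<autoSoftCommit>
  <maxTime>15000</maxTime>            <!-- soft commit: search visibility -->
</autoSoftCommit>
```

With this in place, raising maxWarmingSearchers to hide the "too many warming searchers" warning is no longer necessary.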

My guess is that you are adding more and more documents to Solr, thus
making it more likely that you are opening a bunch of searchers at
once (see above) and running into a race condition. So the first thing
I'd do is straighten that out and see if the problem goes away.

Best,
Erick

On Tue, Dec 6, 2016 at 1:34 PM, Alexandre Drouin
 wrote:
> Hello,
>
> I have an error that has been popping up randomly for the past three weeks,
> and the randomness of the issue makes it hard to troubleshoot.
>
> I have a service that uses the REST API to index documents (1000 docs at a
> time), and in this process I often call the core status API
> (/solr/admin/cores?action=STATUS) to get the statuses of the different cores.
> This process has been working flawlessly since 2014; however, it has been
> failing recently with the exception "this IndexWriter is closed".
>
> I did a few searches on Google for this exception but did not see anything
> relevant.  Does anyone have an idea how to troubleshoot/fix this issue?
>
> This is my configuration:
> - Solr 4.10.2 on Windows.  I am not using SolrCloud.
> - Java 1.7.0_79 24.79-b02
> - useColdSearcher is set to true
> - maxWarmingSearchers is set to 6
> - I changed my Solr configuration about 2-3 months ago: I disabled HTTPS and 
> enabled the logging (INFO level) but I do not think this could cause the 
> issue.
>
> Relevant stack trace:
>
> INFO  - 2016-12-06 18:43:23.854; org.apache.solr.update.CommitTracker; Hard 
> AutoCommit: if uncommited for 9ms; if 75000 uncommited docs
> INFO  - 2016-12-06 18:43:23.856; org.apache.solr.update.CommitTracker; Soft 
> AutoCommit: if uncommited for 15000ms;
> INFO  - 2016-12-06 18:43:23.929; 
> org.apache.solr.update.processor.LogUpdateProcessor; [coreENCA] webapp=/solr 
> path=/update params={commit=false} {add=[Global_44235 (1552993270510911488), 
> Global_44236Pony (1552993270516154368), Global_44236Magnum 
> (1552993270518251520), Global_44237Pony (1552993270519300096), 
> Global_44237Split (1552993270521397249), Global_44237Standard 
> (1552993270523494401), Global_44238Pony (1552993270525591553), 
> Global_44238Standard (1552993270527688704), Global_44238Magnum 
> (1552993270529785856), Global_44239Standard (1552993270531883008), ... (2102 
> adds)]} 0 8292
> INFO  - 2016-12-06 18:43:23.933; org.apache.solr.core.SolrCore; [coreENCA]  
> CLOSING SolrCore org.apache.solr.core.SolrCore@5730eaaf
> INFO  - 2016-12-06 18:43:23.935; org.apache.solr.update.DirectUpdateHandler2; 
> closing DirectUpdateHandler2{commits=0,autocommit maxDocs=75000,autocommit 
> maxTime=9ms,autocommits=0,soft autocommit maxTime=15000ms,soft 
> autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=4176,adds=4176,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=4176,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0,transaction_logs_total_size=84547858,transaction_logs_total_number=1}
> INFO  - 2016-12-06 18:43:23.936; org.apache.solr.core.SolrCore; [coreENCA] 
> Closing main searcher on request.
> INFO  - 2016-12-06 18:43:24.044; org.apache.solr.search.SolrIndexSearcher; 
> Opening Searcher@73a40fbb[coreENCA] main
> ERROR - 2016-12-06 18:43:24.045; org.apache.solr.common.SolrException; 
> org.apache.solr.common.SolrException: Error handling 'status' action
> at 
> org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:710)
> at 
> org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:214)
> at 
> org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:188)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:729)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:258)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:522)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> at 
> org.ecl

IndexWriter exception

2016-12-06 Thread Alexandre Drouin
Hello,

I have an error that has been popping up randomly for the past three weeks,
and the randomness of the issue makes it hard to troubleshoot.

I have a service that uses the REST API to index documents (1000 docs at a time),
and in this process I often call the core status API
(/solr/admin/cores?action=STATUS) to get the statuses of the different cores.
This process has been working flawlessly since 2014; however, it has been failing
recently with the exception "this IndexWriter is closed".

I did a few searches on Google for this exception but did not see anything
relevant.  Does anyone have an idea how to troubleshoot/fix this issue? 

This is my configuration:
- Solr 4.10.2 on Windows.  I am not using SolrCloud. 
- Java 1.7.0_79 24.79-b02
- useColdSearcher is set to true
- maxWarmingSearchers is set to 6
- I changed my Solr configuration about 2-3 months ago: I disabled HTTPS and 
enabled the logging (INFO level) but I do not think this could cause the issue.

Relevant stack trace:

INFO  - 2016-12-06 18:43:23.854; org.apache.solr.update.CommitTracker; Hard 
AutoCommit: if uncommited for 9ms; if 75000 uncommited docs 
INFO  - 2016-12-06 18:43:23.856; org.apache.solr.update.CommitTracker; Soft 
AutoCommit: if uncommited for 15000ms; 
INFO  - 2016-12-06 18:43:23.929; 
org.apache.solr.update.processor.LogUpdateProcessor; [coreENCA] webapp=/solr 
path=/update params={commit=false} {add=[Global_44235 (1552993270510911488), 
Global_44236Pony (1552993270516154368), Global_44236Magnum 
(1552993270518251520), Global_44237Pony (1552993270519300096), 
Global_44237Split (1552993270521397249), Global_44237Standard 
(1552993270523494401), Global_44238Pony (1552993270525591553), 
Global_44238Standard (1552993270527688704), Global_44238Magnum 
(1552993270529785856), Global_44239Standard (1552993270531883008), ... (2102 
adds)]} 0 8292
INFO  - 2016-12-06 18:43:23.933; org.apache.solr.core.SolrCore; [coreENCA]  
CLOSING SolrCore org.apache.solr.core.SolrCore@5730eaaf
INFO  - 2016-12-06 18:43:23.935; org.apache.solr.update.DirectUpdateHandler2; 
closing DirectUpdateHandler2{commits=0,autocommit maxDocs=75000,autocommit 
maxTime=9ms,autocommits=0,soft autocommit maxTime=15000ms,soft 
autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=4176,adds=4176,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=4176,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0,transaction_logs_total_size=84547858,transaction_logs_total_number=1}
INFO  - 2016-12-06 18:43:23.936; org.apache.solr.core.SolrCore; [coreENCA] 
Closing main searcher on request.
INFO  - 2016-12-06 18:43:24.044; org.apache.solr.search.SolrIndexSearcher; 
Opening Searcher@73a40fbb[coreENCA] main
ERROR - 2016-12-06 18:43:24.045; org.apache.solr.common.SolrException; 
org.apache.solr.common.SolrException: Error handling 'status' action 
at 
org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:710)
at 
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:214)
at 
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:188)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at 
org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:729)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:258)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:522)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at 
org.eclipse.jett

Re: How to use the StandardTokenizer with currency

2016-12-06 Thread Steve Rowe
Cool, thanks for letting us know (and sorry about the typo!)

--
Steve
www.lucidworks.com

> On Dec 6, 2016, at 4:15 PM, Vinay B,  wrote:
> 
> Yes, that works (apart from the typo in PatternReplaceCharFilterFactory)
> 
> Here is my config
> 
> 
> <fieldType name="..." class="solr.TextField"
>            positionIncrementGap="100" autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>
>     <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\$"
>                 replacement="xxdollarxx"/>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="xxdollarxx"
>             replacement="\$" replace="all"/>
>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
>             enablePositionIncrements="true"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
>             catenateAll="0" splitOnCaseChange="1" types="word-delim-types.txt"/>
>   </analyzer>
>   <analyzer type="query">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>
>     <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\$"
>                 replacement="xxdollarxx"/>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="xxdollarxx"
>             replacement="\$" replace="all"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="..."
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
>             enablePositionIncrements="true"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>             generateNumberParts="1" catenateWords="0" catenateNumbers="0"
>             catenateAll="0" splitOnCaseChange="1" types="word-delim-types.txt"/>
>   </analyzer>
> </fieldType>
> 
> 
> On Wed, Nov 30, 2016 at 2:08 PM, Steve Rowe  wrote:
> 
>> Hi Vinay,
>> 
>> You should be able to use a char filter to convert “$” characters into
>> something that will survive tokenization, and then a token filter to
>> convert it back.
>> 
>> Something like this (untested):
>> 
>>    <charFilter class="solr.PatternReplaceCharFilterFactory"
>>                pattern="\$"
>>                replacement="__dollar__"/>
>>
>>> http://stackoverflow.com/questions/40877567/using-standardtokenizerfactory-with-currency
>>> 
>>> I'd like to maintain other aspects of the StandardTokenizer functionality
>>> but I'm wondering if to do what I want, the task boils down to be able to
>>> instruct the StandardTokenizer not to discard the $ symbol ? Or is there
>>> another way? I'm hoping that this is possible with configuration, rather
>>> than code changes.
>>> 
>>> Thanks
>> 
>> 



Re: How to use the StandardTokenizer with currency

2016-12-06 Thread Vinay B,
Yes, that works (apart from the typo in PatternReplaceCharFilterFactory)

Here is my config

<fieldType name="..." class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\$"
                replacement="xxdollarxx"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="xxdollarxx"
            replacement="\$" replace="all"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="1" types="word-delim-types.txt"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\$"
                replacement="xxdollarxx"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="xxdollarxx"
            replacement="\$" replace="all"/>
    <filter class="solr.SynonymFilterFactory" synonyms="..."
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"
            catenateAll="0" splitOnCaseChange="1" types="word-delim-types.txt"/>
  </analyzer>
</fieldType>


On Wed, Nov 30, 2016 at 2:08 PM, Steve Rowe  wrote:

> Hi Vinay,
>
> You should be able to use a char filter to convert “$” characters into
> something that will survive tokenization, and then a token filter to
> convert it back.
>
> Something like this (untested):
>
>   <charFilter class="solr.PatternReplaceCharFilterFactory"
>               pattern="\$"
>               replacement="__dollar__"/>
> 
> http://stackoverflow.com/questions/40877567/using-
> standardtokenizerfactory-with-currency
> >
> > I'd like to maintain other aspects of the StandardTokenizer functionality
> > but I'm wondering if to do what I want, the task boils down to be able to
> > instruct the StandardTokenizer not to discard the $ symbol ? Or is there
> > another way? I'm hoping that this is possible with configuration, rather
> > than code changes.
> >
> > Thanks
>
>
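
[Editor's note: the protect-then-restore trick above can be sketched outside Solr. This is a hypothetical Python illustration; the placeholder string and the regex tokenizer are stand-ins for the char filter, StandardTokenizer, and token filter, not Solr internals.]

```python
import re

PLACEHOLDER = "xxdollarxx"  # token-safe stand-in for "$"

def char_filter(text):
    # pre-tokenization: shield "$" so the tokenizer cannot discard it
    return text.replace("$", PLACEHOLDER)

def tokenize(text):
    # crude stand-in for StandardTokenizer: keeps word characters only
    return re.findall(r"\w+", text)

def token_filter(tokens):
    # post-tokenization: restore the original "$" symbol
    return [t.replace(PLACEHOLDER, "$") for t in tokens]

tokens = token_filter(tokenize(char_filter("price is $100")))
print(tokens)  # ['price', 'is', '$100']
```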


Re: solr audit logging

2016-12-06 Thread Alexandre Rafalovitch
There is also Jetty level access log which shows the requests, though
it may not show the HTTP PUT bodies.

Finally, various online monitoring services probably have agents that
integrate with Solr to show what's happening. Usually costs money
though.

Regards,
Alex.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 6 December 2016 at 14:34, Jeff Courtade  wrote:
> Thanks very much. The trace idea is a brilliant way to dig into it; it
> did not occur to me.
>
> I had another coworker suggest a custom
>
> http://lucene.apache.org/solr/6_3_0/solr-core/org/apache/solr/update/processor/LogUpdateProcessorFactory.html
>
>
> This is beyond my limited abilities.
>
>
> I will see what we can dig up out of the logs...
>
>
> the original request was this...
>
>
> "Is there any configuration, plugin, or application that will create an
> audit trail for Solr requests? We have teams that would like to be able to
> pull back changes/requests to documents in solr given a time period. The
> information they would like to retrieve is the request to solr, where it
> came from, and what the request did."
>
>
> I am starting to think there is not a simple solution to this. I was hoping
> there was an UpdateAudit class or something I could flip a switch on or
> some such...
>
>
>
> On Tue, Dec 6, 2016 at 2:20 PM, Alexandre Rafalovitch 
> wrote:
>
>> You could turn the trace mode for everything in the Admin UI (under
>> logs/levels) and see if any of the existing information is sufficient
>> for your needs. If yes, then you change log level in the configuration
>> just for that class/element.
>>
>> Alternatively, you could do a custom UpdateRequestProcessor in the
>> request handler(s) that deal with update. Or perhaps
>> LogUpdateProcessor (that's in every standard chain) is sufficient:
>> http://www.solr-start.com/javadoc/solr-lucene/org/
>> apache/solr/update/processor/LogUpdateProcessorFactory.html
>>
>> But it is also possible that the audit.log is something that has a
>> specific format that other tools use. So, you could start from asking
>> how that file would be used and then working backwards into Solr.
>> Which would most likely be a custom URP, as I mentioned earlier.
>>
>> Regards,
>>Alex.
>> P.s. Remember that there are full document updates and partial
>> updates. What you want to log about that is your business level
>> decision.
>>
>
>
>
> --
> Thanks,
>
> Jeff Courtade
> M: 240.507.6116


CDCR Replication from one source to multiple targets

2016-12-06 Thread Webster Homer
We would like to load a collection and have it replicate out to multiple
clusters. For example, we want a US cluster to be able to replicate to
Europe and Asia.

I tried to create two source cdcrRequestHandlers,
/cdcr01 and /cdcr02, each differing only in its target ZooKeepers.

When the target handlers were both named /cdcr, I saw this in the logs:
2016-12-06 18:53:34.104 WARN
 (cdcr-replicator-80-thread-2-processing-n:stlpj1scld.sial.com:8983_solr) [
  ] o.a.s.h.CdcrReplicator Log reader for target sial-catalog-product is
not initialised, it will be ignored.


I saw a lot of these messages.

Nothing was replicated to either cluster.
I turned off a target and its source, and it started replicating.

So can this be done somehow? The current system seems limited without that.

Thanks,
Webster



Re: solr audit logging

2016-12-06 Thread Jeff Courtade
Thanks very much. The trace idea is a brilliant way to dig into it; it
did not occur to me.

I had another coworker suggest a custom

http://lucene.apache.org/solr/6_3_0/solr-core/org/apache/solr/update/processor/LogUpdateProcessorFactory.html


This is beyond my limited abilities.


I will see what we can dig up out of the logs...


the original request was this...


"Is there any configuration, plugin, or application that will create an
audit trail for Solr requests? We have teams that would like to be able to
pull back changes/requests to documents in solr given a time period. The
information they would like to retrieve is the request to solr, where it
came from, and what the request did."


I am starting to think there is not a simple solution to this. I was hoping
there was an UpdateAudit class or something I could flip a switch on or
some such...



On Tue, Dec 6, 2016 at 2:20 PM, Alexandre Rafalovitch 
wrote:

> You could turn the trace mode for everything in the Admin UI (under
> logs/levels) and see if any of the existing information is sufficient
> for your needs. If yes, then you change log level in the configuration
> just for that class/element.
>
> Alternatively, you could do a custom UpdateRequestProcessor in the
> request handler(s) that deal with update. Or perhaps
> LogUpdateProcessor (that's in every standard chain) is sufficient:
> http://www.solr-start.com/javadoc/solr-lucene/org/
> apache/solr/update/processor/LogUpdateProcessorFactory.html
>
> But it is also possible that the audit.log is something that has a
> specific format that other tools use. So, you could start from asking
> how that file would be used and then working backwards into Solr.
> Which would most likely be a custom URP, as I mentioned earlier.
>
> Regards,
>Alex.
> P.s. Remember that there are full document updates and partial
> updates. What you want to log about that is your business level
> decision.
>



--
Thanks,

Jeff Courtade
M: 240.507.6116


Re: solr audit logging

2016-12-06 Thread Alexandre Rafalovitch
You could turn the trace mode for everything in the Admin UI (under
logs/levels) and see if any of the existing information is sufficient
for your needs. If yes, then you change log level in the configuration
just for that class/element.

Alternatively, you could do a custom UpdateRequestProcessor in the
request handler(s) that deal with update. Or perhaps
LogUpdateProcessor (that's in every standard chain) is sufficient:
http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/update/processor/LogUpdateProcessorFactory.html
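
[Editor's note: a hedged sketch of routing that logger to its own audit file, untested; the appender name, paths, and sizes are illustrative, assuming Solr 6.x's stock log4j 1.2 setup.]

```properties
# Hypothetical log4j.properties fragment: send LogUpdateProcessorFactory
# output to a dedicated audit.log instead of the main solr.log.
log4j.appender.AUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.AUDIT.File=${solr.log}/audit.log
log4j.appender.AUDIT.MaxFileSize=50MB
log4j.appender.AUDIT.MaxBackupIndex=10
log4j.appender.AUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.AUDIT.layout.ConversionPattern=%d{ISO8601} %p %m%n

# route only the update-processor logger to the audit appender
log4j.logger.org.apache.solr.update.processor.LogUpdateProcessorFactory=INFO, AUDIT
log4j.additivity.org.apache.solr.update.processor.LogUpdateProcessorFactory=false
```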

But it is also possible that the audit.log is something that has a
specific format that other tools use. So, you could start from asking
how that file would be used and then working backwards into Solr.
Which would most likely be a custom URP, as I mentioned earlier.

Regards,
   Alex.
P.s. Remember that there are full document updates and partial
updates. What you want to log about that is your business level
decision.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 6 December 2016 at 13:55, Jeff Courtade  wrote:
> Hello,
>
> Could someone point me in the correct direction for this?
>
> I am being asked to setup an "audit.log" or audit trail for writes and
> changes to documents.
>
> I do not know where to begin with something like this.
>
> I am guessing it is just configuration of log4j, but that is about as far
> as I can go...
>
> Maybe something to do with logging things through only the update handler
> or something?
>
> Anyone bearing a cluestick is welcome.
>
> --
> Thanks,
>
> Jeff Courtade
> M: 240.507.6116


solr audit logging

2016-12-06 Thread Jeff Courtade
Hello,

Could someone point me in the correct direction for this?

I am being asked to setup an "audit.log" or audit trail for writes and
changes to documents.

I do not know where to begin with something like this.

I am guessing it is just configuration of log4j, but that is about as far
as I can go...

Maybe something to do with logging things through only the update handler
or something?

Anyone bearing a cluestick is welcome.

--
Thanks,

Jeff Courtade
M: 240.507.6116


Re: Solr node not found in ZK live_nodes

2016-12-06 Thread Erick Erickson
The most likely reason is that the Solr node in question
was not reachable, thus it was removed from
live_nodes, perhaps due to a temporary network
glitch, a long GC pause, or the like. If you're rolling
your logs over, it's quite possible that any illuminating
messages were lost. The default 4M size for each
log is quite low at INFO level...

It does seem possible for a Solr node to periodically
check its status and re-insert itself into live_nodes,
go through recovery and all that. So far most of that
registration logic is baked into startup code. What
do others think? Worth a JIRA?

Erick

On Tue, Dec 6, 2016 at 3:53 AM, Manohar Sripada  wrote:
> We have a 16-node Solr (5.2.1) cluster and a 5-node ZooKeeper (3.4.6) ensemble.
>
> All the Solr nodes were registered to ZooKeeper (ls /live_nodes) when the
> setup was done 3 months back. Suddenly, a few days back, our search started
> failing because one of the Solr nodes (call it s16) was not seen in
> ZooKeeper, i.e., when we checked with *"ls /live_nodes"*, the *s16* node
> was not found. However, the corresponding Solr process was up and running.
>
> To my surprise, I couldn't find any errors or warnings in solr or zookeeper
> logs related to this. I have few questions -
>
> 1. Is there any reason why this registration to ZK was lost? I know the
> logs should provide some information, but they didn't. Has anyone
> encountered a similar issue? If so, what can be the root cause?
> 2. Shouldn't Solr be clever enough to detect that the registration to ZK
> was lost (for some reason) and try to re-register?
>
> PS: The issue is resolved by restarting the Solr node. However, I am
> curious to know why it happened in the first place.
>
> Thanks


Re: Difference between currency fieldType and float fieldType

2016-12-06 Thread Dorian Hoxha
Don't use float for money (in whatever db).
https://wiki.apache.org/solr/CurrencyField
What you do is save the money as cents, and store that in a long. That's
what the currencyField probably does for you inside.
It provides currency conversion at query-time.
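
[Editor's note: a minimal sketch of the store-as-cents approach, assuming US dollars with two decimal places; the helper names are illustrative, not part of Solr.]

```python
from decimal import Decimal

def to_cents(amount_str):
    # parse "$1234.56" exactly with Decimal and store integer cents;
    # float would eventually lose precision, integers do not
    return int(Decimal(amount_str.lstrip("$")) * 100)

def to_display(cents):
    # convert integer cents back to a display string
    return f"${cents // 100}.{cents % 100:02d}"

print(to_cents("$1234.56"))   # 123456
print(to_display(123456))     # $1234.56

# a range query like "less than $2000" is issued against the cents value
print(to_cents("$2000.00"))   # 200000
```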


On Tue, Dec 6, 2016 at 4:45 AM, Zheng Lin Edwin Yeo 
wrote:

> Hi,
>
> Would like to understand better between the currency fieldType and float
> fieldType.
>
> If I were to index a field that is a currency field by nature (Eg: amount)
> into Solr, is it better to use the currency fieldType as compared to the
> float fieldType?
>
> I found that for the float fieldType, if the amount is very big, the last
> decimal place may get cut off in the index. For example, if the amount in
> the original document is 800212.64, the number that is indexed in Solr is
> 800212.6.
>
> Although by using the currency fieldType will solve this issue, but however
> I found that I am not able to do faceting on currency fieldType. I will
> need to have the facet so that I can list out the various amount that are
> available based on the search criteria.
>
> As such, will like to seek your recommendation to determine which fieldType
> is best for my needs.
>
> I'm using Solr 6.2.1
>
> Regards,
> Edwin
>


Re: Using DIH FileListEntityProcessor with SolrCloud

2016-12-06 Thread Tom Evans
On Fri, Dec 2, 2016 at 4:36 PM, Chris Rogers
 wrote:
> Hi all,
>
> A question regarding using the DIH FileListEntityProcessor with SolrCloud 
> (solr 6.3.0, zookeeper 3.4.8).
>
> I get that the config in SolrCloud lives on the Zookeeper node (a different 
> server from the solr nodes in my setup).
>
> With this in mind, where is the baseDir attribute in the 
> FileListEntityProcessor config relative to? I’m seeing the config in the Solr 
> GUI, and I’ve tried setting it as an absolute path on my Zookeeper server, 
> but this doesn’t seem to work… any ideas how this should be setup?
>
> My DIH config is below:
>
> <dataConfig>
>   <dataSource type="FileDataSource"/>
>   <document>
>     <entity name="f"
>             processor="FileListEntityProcessor"
>             fileName=".*xml"
>             newerThan="'NOW-5YEARS'"
>             recursive="true"
>             rootEntity="false"
>             dataSource="null"
>             baseDir="/home/bodl-zoo-svc/files/">
>
>       <entity name="x"
>               processor="XPathEntityProcessor"
>               forEach="/TEI"
>               url="${f.fileAbsolutePath}"
>               transformer="RegexTransformer">
>         <field column="title" xpath="/TEI/teiHeader/fileDesc/titleStmt/title"/>
>         <field column="publisher" xpath="/TEI/teiHeader/fileDesc/publicationStmt/publisher"/>
>         <field column="idno" xpath="/TEI/teiHeader/fileDesc/sourceDesc/msDesc/msIdentifier/altIdentifier/idno"/>
>       </entity>
>     </entity>
>   </document>
> </dataConfig>
>
>
> This same script worked as expected on a single solr node (i.e. not in 
> SolrCloud mode).
>
> Thanks,
> Chris
>

Hey Chris

We hit the same problem moving from non-cloud to cloud, we had a
collection that loaded its DIH config from various XML files listing
the DB queries to run. We wrote a simple DataSource plugin function to
load the config from Zookeeper instead of local disk to avoid having
to distribute those config files around the cluster.

https://issues.apache.org/jira/browse/SOLR-8557

Cheers

Tom


Solr node not found in ZK live_nodes

2016-12-06 Thread Manohar Sripada
We have a 16-node Solr (5.2.1) cluster and a 5-node ZooKeeper (3.4.6) ensemble.

All the Solr nodes were registered to ZooKeeper (ls /live_nodes) when the
setup was done 3 months back. Suddenly, a few days back, our search started
failing because one of the Solr nodes (call it s16) was not seen in
ZooKeeper, i.e., when we checked with *"ls /live_nodes"*, the *s16* node
was not found. However, the corresponding Solr process was up and running.

To my surprise, I couldn't find any errors or warnings in solr or zookeeper
logs related to this. I have few questions -

1. Is there any reason why this registration to ZK was lost? I know the
logs should provide some information, but they didn't. Has anyone
encountered a similar issue? If so, what can be the root cause?
2. Shouldn't Solr be clever enough to detect that the registration to ZK
was lost (for some reason) and try to re-register?

PS: The issue is resolved by restarting the Solr node. However, I am
curious to know why it happened in the first place.

Thanks


logstash config

2016-12-06 Thread Arkadi Colson

Hi

Does anybody have a recent logstash config for parsing Solr logs? I'm
using version 6.3.0.


Thanks!

BR
Arkadi


Re: Solr seems to reserve facet.limit results

2016-12-06 Thread Toke Eskildsen
On Mon, 2016-12-05 at 17:47 -0700, Chris Hostetter wrote:
> : One simple solution, in my case would be, now just thinking of it,
> : run the query with no facets and no rows, get the numFound, and set
> : that as facet.limit for the actual query.
> 
> ...that assumes that the number of facet constraints returned is
> limited by the total number of documents matching the query -- in
> general there is no such guarantee because of multivalued fields (or
> faceting on tokenized fields), so this type of approach isn't a good
> idea as a generalized solution

For simple String/Text faceting, which Markus seems to be using, the
number of repetitions of a term in a document does not matter: Each
term only counts at most once per document.
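
[Editor's note: that once-per-document behaviour can be sketched with a hypothetical counter; the documents and field values below are made up for illustration.]

```python
from collections import Counter

docs = [
    {"id": 1, "tags": ["red", "red", "blue"]},  # "red" repeats in the field
    {"id": 2, "tags": ["red"]},
]

facets = Counter()
for doc in docs:
    # set(): each term counts at most once per matching document,
    # no matter how often it repeats inside the document
    facets.update(set(doc["tags"]))

print(facets["red"], facets["blue"])  # 2 1
```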


If there are any common case deviations from this, the preface to the
faceting documentation should be updated: "...along with numerical
counts of how many matching documents were found for each term".
https://cwiki.apache.org/confluence/display/solr/Faceting

- Toke Eskildsen, State and University Library, Denmark