Checking performance of plugins, queryParser, edismax, etc

2016-06-07 Thread Zheng Lin Edwin Yeo
Hi,

I would like to find out: is there a way to check the performance of
the query parser and components like edismax in Solr?

I have tried debug=true, but it only shows general information, like
the time taken for query, highlight, etc.

"process":{
"time":6397.0,
"query":{
  "time":5938.0},
"facet":{
  "time":0.0},
"facet_module":{
  "time":39.0},
"mlt":{
  "time":0.0},
"highlight":{
  "time":386.0},
"stats":{
  "time":0.0},
"expand":{
  "time":0.0},
"debug":{
  "time":32.0}

I'm trying to find out what is causing the query to slow down. I have
included plugins like SynonymExpandingExtendedDismaxQParserPlugin, and would
like to measure the time it takes to process that plugin and other things
like edismax.
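
In case it helps frame the question: one way I imagine measuring this is a
small wrapper query parser that delegates to edismax and records the parse
time. This is only an untested sketch of the idea — the wrapper class is
hypothetical, not something Solr ships:

import org.apache.lucene.search.Query;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.ExtendedDismaxQParserPlugin;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;
import org.apache.solr.search.SyntaxError;

// Hypothetical plugin: wraps edismax and records how long parse() takes.
public class TimingQParserPlugin extends QParserPlugin {
  private final QParserPlugin delegate = new ExtendedDismaxQParserPlugin();

  @Override
  public QParser createParser(String qstr, SolrParams localParams,
                              SolrParams params, SolrQueryRequest req) {
    final QParser inner = delegate.createParser(qstr, localParams, params, req);
    return new QParser(qstr, localParams, params, req) {
      @Override
      public Query parse() throws SyntaxError {
        long start = System.nanoTime();
        Query q = inner.parse();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        req.getContext().put("qparser.parseTimeMs", elapsedMs); // or log it
        return q;
      }
    };
  }
}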

I'm using Solr 6.0.1.

Regards,
Edwin


Re: Concern of large amount daily update

2016-06-07 Thread Erick Erickson
Atomic updates are really a full document re-index under the covers. What
happens is that the stored fields are all read from disk, your updates are
overlaid, and the entire document is re-indexed. From Solr's
perspective, this is probably actually _more_ work than just having
the document resent completely.
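
To make that concrete, here is roughly what an atomic update looks like from
SolrJ (a sketch: the client setup and the "price_f" field are placeholders);
even this small "set" causes the whole stored document to be re-indexed:

import java.io.IOException;
import java.util.Collections;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.common.SolrInputDocument;

class AtomicUpdateExample {
  // assumes a stored float field "price_f" in the schema
  static void setPrice(SolrClient client, String id, float price)
      throws SolrServerException, IOException {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", id);
    doc.addField("price_f", Collections.singletonMap("set", price));
    client.add(doc);   // Solr reads the stored doc, applies "set", re-indexes it all
    client.commit();
  }
}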

100K documents each _day_ is a pretty small update load actually.
Indexing Wiki docs on my laptop I can get 3-4 _thousand_ docs a
second.

I believe you should start with just re-indexing the entire documents
and go to more complex solutions if (and only if) that performs
poorly. My bet is that you'll wind up spending more time getting the
documents from your system of record and Solr will hardly notice the
indexing load.

Much depends on how you index, of course. If you're sending docs for
the ExtractingRequestHandler to process, you'll be putting the
extraction load on your servers; you might consider moving that
processing to a separate client.
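
As a sketch of what moving it to a separate client can look like (the ZK
address, collection and field names below are placeholders, not anything from
your setup), Tika runs in your own JVM and only plain documents hit the
cluster:

import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.sax.BodyContentHandler;

public class ClientSideExtractor {
  public static void main(String[] args) throws Exception {
    try (CloudSolrClient client = new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181")) {
      client.setDefaultCollection("docs");
      AutoDetectParser parser = new AutoDetectParser();
      for (String path : args) {
        BodyContentHandler text = new BodyContentHandler(-1); // -1 = no size limit
        try (InputStream in = new FileInputStream(path)) {
          parser.parse(in, text, new Metadata()); // extraction happens client-side
        }
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", path);
        doc.addField("content_txt", text.toString());
        client.add(doc);
      }
      client.commit();
    }
  }
}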

Best,
Erick

On Mon, Jun 6, 2016 at 7:00 PM, scott.chu  wrote:
>
>
> We recently plan to replace an old-school Lucene setup that has 50M docs with 
> SolrCloud, but the daily update, according to the responsible colleague, could 
> be around 100 thousand docs. Its data source is a bunch of MySQL tables. 
> When implementing the updating workflow, what should I do to keep the time 
> spent updating docs reasonable? Currently what I have in mind is:
>
> 1. Use atomic updates to avoid unnecessary full-doc updates.
> 2. Run multiple instances of my updating process, each updating a different 
> range of docs.
>
> Are there other things I can do to help my issue? Are there any suggestions 
> or experiences for preparing appropriate h/w, e.g. CPU or RAM?
>
> scott.chu,scott@udngroup.com
> 2016/6/7 (Tue)


Re: Solr 6 fail to index images

2016-06-07 Thread Alexandre Rafalovitch
How do you currently map all those fields to a point type? It seems to be the
same issue there, so if you solve it for one, you've solved it for the
other by using a regex to match field names.

Specifically, RegexReplaceProcessorFactory inherits from
FieldMutatingUpdateProcessorFactory, which means it can match field
names with regex using fieldRegex parameter:
http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/update/processor/FieldMutatingUpdateProcessorFactory.html
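
An untested sketch of what that could look like in solrconfig.xml (the
fieldRegex, pattern and chain name here are just illustrations for the
colour-point fields from the error message):

<updateRequestProcessorChain name="strip-parens" default="true">
  <processor class="solr.RegexReplaceProcessorFactory">
    <str name="fieldRegex">.*_point$</str>
    <str name="pattern">[()]</str>
    <str name="replacement"></str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>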

Regards,
   Alex

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 8 June 2016 at 08:07, Jeferson dos Anjos  wrote:
> The problem with this solution is that I need to know all possible fields
> that can generate this type of error. For example, TIFF images get
> different errors. = /
>
> 2016-06-06 15:13 GMT-03:00 Shawn Heisey :
>
>> On 6/6/2016 10:56 AM, Jeferson dos Anjos wrote:
>> > I'm trying to index images on SOLR, but I get the following error:
>> > ERROR: [doc=5b36cb2b78072e41] Error adding field
>> > 'media_black_point'='(0.012054443, 0.012496948, 0.010314941)' msg=For
>> > input string: "(0.012054443" It looks like it's a problem of field
>> > types, but these fields are extracted automatically. I'm forgetting
>> > some additional configuration?
>>
>> Looks like you're probably running into this, which was marked "Won't Fix":
>>
>> https://issues.apache.org/jira/browse/SOLR-8017
>>
>> Thanks,
>> Shawn
>>
>>
>
>
> --
> Jeferson M. dos Anjos
> CEO do Packdocs
> ps.: Keep your files alive with Packdocs (www.packdocs.com)


Re: Solr 6 fail to index images

2016-06-07 Thread Jeferson dos Anjos
The problem with this solution is that I need to know all possible fields
that can generate this type of error. For example, TIFF images get
different errors. = /

2016-06-06 15:13 GMT-03:00 Shawn Heisey :

> On 6/6/2016 10:56 AM, Jeferson dos Anjos wrote:
> > I'm trying to index images on SOLR, but I get the following error:
> > ERROR: [doc=5b36cb2b78072e41] Error adding field
> > 'media_black_point'='(0.012054443, 0.012496948, 0.010314941)' msg=For
> > input string: "(0.012054443" It looks like it's a problem of field
> > types, but these fields are extracted automatically. I'm forgetting
> > some additional configuration?
>
> Looks like you're probably running into this, which was marked "Won't Fix":
>
> https://issues.apache.org/jira/browse/SOLR-8017
>
> Thanks,
> Shawn
>
>


-- 
Jeferson M. dos Anjos
CEO do Packdocs
ps.: Keep your files alive with Packdocs (www.packdocs.com)


Re: SolrCloud SolrNode stopping randomly for no reason

2016-06-07 Thread Shawn Heisey
On 6/7/2016 9:55 AM, Pablo Anzorena wrote:
> Sorry for the poor details, but I didn't post the log files because
> there was nothing out of the ordinary in the solr.log file, nor in
> the solr-8984-console.log, nor in solr_gc.log. What log do you want me
> to show you? solr.log.1 (which I think should be the most recent one) for
> example? Do you need the tail or the head of the file? When I say
> "stopping for no reason" I mean the service is not running anymore;
> the process is finished. I tried killing it with the kill -9 command and
> it does not log that. My first thought was that I had restarted the
> standalone Solr service, which tries to stop the service and, if it can't,
> kills it by doing SOLR_PROCESS_ID=$(ps -eaf | grep -v "grep" | grep
> "start.jar" | awk '{print $2}'); kill -9 ${SOLR_PROCESS_ID}. So sometimes
> it could kill SolrCloud instead of the standalone instance, but sometimes
> the datetime does not match. Another option is that it hit an
> OutOfMemoryError and the OOM script is killing the process, but again I
> saw nothing in the solr_gc.log.

I'm pretty sure that nothing would get logged in the gc log for an
OutOfMemoryError.  It might show up in solr.log (or one of the rotated
or renamed solr.log files), but depending on exactly what code throws
the OOME, it's also possible that the actual exception won't be logged
at all.

The bin/solr script in 5.2.1 uses the OOM killer option incorrectly --
so it doesn't even work.  If you fix the command line to make it work,
then it would create a solr_oom_killer_STUFF logfile.
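
For illustration (the install path and port here are placeholders), the
option the script is trying to pass to the JVM looks like this; the 5.2.1
bug is in how the script quotes it:

-XX:OnOutOfMemoryError="/opt/solr/bin/oom_solr.sh 8983 /var/solr/logs"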

I would strongly recommend editing bin/solr to increase the "waiting to
start/die" timeout from 5 seconds to 30-60 seconds, especially if you
are running more than one Solr or Jetty process on the machine.  It
might also be a good idea to file an issue requesting a change in how
the script figures out which process gets the "kill -9" signal.

Thanks,
Shawn



Re: Field Definitions Ignored

2016-06-07 Thread Shawn Heisey
On 6/7/2016 12:37 PM, Alan G Quan wrote:
> Solr seems to ignore my field definitions in schema.xml. I have
> defined many fields, each using the standard syntax, e.g.:
> <field name="product_id" type="int" indexed="true" stored="true"/>
> These fields are defined but not yet populated with data. Solr reads the
> schema.xml with no problem on startup, and the core using that schema
> is created successfully, but none of the fields appear in the Schema
> Browser. Is this because the fields are not defined in Solr until they
> are initialized with actual document data?

I just checked the schema browser for a completely empty index.  It has
all the fields defined in that index.

I suspect that the schema you are changing is not the active schema that
is being used by the core you are looking at.

If you are running 5.5 or later, the schema will normally be read from a
file named managed-schema, not a file named schema.xml.

https://issues.apache.org/jira/browse/SOLR-8387

If you are running in cloud mode, the active schema and config are in
zookeeper.   Editing files on the disk will accomplish nothing, unless
you upload the changes to zookeeper and reload the collection or restart
all your Solr instances.
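
For example (ZK address, config and collection names are placeholders),
something along these lines pushes the edited config and makes it active:

bin/solr zk -upconfig -z zk1:2181,zk2:2181,zk3:2181 -n myconf -d /path/to/conf
curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection"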

If the file you are editing is under the configsets directory, then it
is not the active schema for a core that already exists unless you are
using the explicit configsets feature, which most people do not do.

Thanks,
Shawn



Re: Index and Search file path

2016-06-07 Thread Shawn Heisey
On 6/7/2016 2:59 PM, sz wrote:
> Hi all, Recently I ran into an issue indexing and searching a path value.
> For example, I have a field with Solr type "string". I tried to store a
> file path with the value '\\root\path1\path2'. After successfully
> indexing, I checked via the Solr Admin UI and see the value is
> 'root\\path1\\path2'. Apparently Solr/Lucene added the escape
> character '\' automatically to '\' in the original value. And to search
> this value, I have to manually add '\' to my search term. I am
> wondering why Solr adds it automatically during indexing while I have to
> add it manually during query. My issue with it is that even when I add the
> escape '\' manually, it won't work in a join query (search in
> subdocuments).

The backslash character is the escape character in the syntax of a lot
of languages and data formats.

When you get JSON results, if there is a literal backslash, it shows up
as two backslashes -- one to say "the next character is escaped" and
then the escaped character.  When the data returned by a query (which is
stored internally with only one backslash) is run through standard JSON
encoding, the backslashes are escaped.  If you were to ask for XML
results, I think you'd find that there's only one backslash, because XML
works very differently than JSON.

Solr queries are the same way.  Characters that should be treated
literally are preceded by a backslash, so if you want an actual
backslash in your query, you have to put two of them, or run your query
through a function that will escape special characters, which is going
to add a backslash to each special character -- including existing
backslashes.
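
If you are building queries in SolrJ, one way to avoid doing the escaping by
hand is ClientUtils (a sketch, assuming a string field named "path"):

import org.apache.solr.client.solrj.util.ClientUtils;

// the literal path \\root\path1\path2, written as a Java string
String raw = "\\\\root\\path1\\path2";
// escapeQueryChars doubles each backslash (and escapes the other specials)
String q = "path:" + ClientUtils.escapeQueryChars(raw);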

Thanks,
Shawn



Index and Search file path

2016-06-07 Thread sz
Hi all,

Recently I ran into an issue indexing and searching a path value.

For example, I have a field with Solr type "string".  I tried to store a
file path with the value '\\root\path1\path2'.  After successfully indexing, I
checked via the Solr Admin UI and see the value is 'root\\path1\\path2'.
Apparently Solr/Lucene added the escape character '\' automatically to '\'
in the original value.  And to search this value, I have to manually add '\'
to my search term.

I am wondering why Solr adds it automatically during indexing while I have to
add it manually during query.  My issue with it is that even when I add the
escape '\' manually, it won't work in a join query (search in subdocuments).


I am aware Lucene/Solr has some special characters such as + - && || ! ( ) {
} [ ] ^ " ~ * ? : \ /
But when I tried different characters, Solr only adds the escape character
'\' to the special characters '"' and '\'.  For the rest, the value in the
index is as is, without the escape character '\'.
  
Here is my testing value:
   (a^b:c\d~e+ag"h\i*j-k[l?m|n}
In the Solr index, the value is:
  (a^b:c\\d~e+ag\"h\\i*j-k[l?m|n}


Does anyone have experience indexing and searching file path values?

Thanks,
SZ






start parameter for CloudSolrStream

2016-06-07 Thread Susmit Shukla
*sending with correct subject*

Does solr streaming aggregation support pagination?
Some documents seem to be skipped if I set "start" parameter for
CloudSolrStream for a sharded collection.

Thanks,
Susmit


Re: Field Definitions Ignored

2016-06-07 Thread Susmit Shukla
Does solr streaming aggregation support pagination?
Some documents seem to be skipped if I set "start" parameter for
CloudSolrStream for a sharded collection.

Thanks,
Susmit


Re: Solutions for Multi-word Synonyms

2016-06-07 Thread Joe Lawson
I'm sorry I wasn't more specific. I meant we were hijacking the thread with
the question "Anyone used a different method of
handling multi-term synonyms that isn't as global?", as the original thread
was about getting synonym_edismax running.

On Tue, Jun 7, 2016 at 2:24 PM, MaryJo Sminkey  wrote:

> > MaryJo you might want to start a new thread, I think we kinda hijacked
> this
> > one. Also if you are interested in tuning queries check out
> > http://splainer.io/ and https://www.quepid.com which are interactive
> tools
> > (both of which my company makes) to tune for search relevancy.
> >
>
>
> Okay I changed the subject. But I don't need a tuning tool; I already know
> WHY I'm not getting the results I need. The problem is how to fix it or get
> around what the plugin is doing. Which is why I was inquiring whether people
> have had success with something other than this particular plugin for
> the more advanced queries that it messes around with. It seems to do a good job
> if you aren't doing anything particularly complicated with your search
> logic, but I don't see a good way to solve the issue I'm having, and a
> tuning tool isn't really going to help with that. We were pretty happy with
> our search relevancy for the most part *other* than the problem with the
> multi-term synonyms not working reliably, but I definitely can't lose
> the relevancy we had just to get those working.
>
> In reviewing your tools previously, the problem as I recall is that they
> rely on querying Solr directly, while our searches go through multiple
> levels of an application which includes a lot of additional logic in terms
> of what data gets sent to Solr, so they just aren't going to
> be of much use for us. It was easier for me to just write my own tool that
> essentially does the same kind of thing, but with my application logic
> built in.
>
> Mary Jo
>


Field Definitions Ignored

2016-06-07 Thread Alan G Quan
Solr seems to ignore my field definitions in schema.xml.  I have defined many 
fields, each using the standard syntax, e.g.,:

These fields are defined but not yet populated with data.  Solr reads the 
schema.xml with no problem on startup, and the core using that schema is 
created successfully, but none of the fields appear in the Schema Browser.  Is 
this because the fields are not defined in Solr until they are initialized with 
actual document data?

Thank you,
Alan



Solutions for Multi-word Synonyms

2016-06-07 Thread MaryJo Sminkey
> MaryJo you might want to start a new thread, I think we kinda hijacked this
> one. Also if you are interested in tuning queries check out
> http://splainer.io/ and https://www.quepid.com which are interactive tools
> (both of which my company makes) to tune for search relevancy.
>


Okay I changed the subject. But I don't need a tuning tool; I already know
WHY I'm not getting the results I need. The problem is how to fix it or get
around what the plugin is doing. Which is why I was inquiring whether people
have had success with something other than this particular plugin for
the more advanced queries that it messes around with. It seems to do a good job
if you aren't doing anything particularly complicated with your search
logic, but I don't see a good way to solve the issue I'm having, and a
tuning tool isn't really going to help with that. We were pretty happy with
our search relevancy for the most part *other* than the problem with the
multi-term synonyms not working reliably, but I definitely can't lose
the relevancy we had just to get those working.

In reviewing your tools previously, the problem as I recall is that they
rely on querying Solr directly, while our searches go through multiple
levels of an application which includes a lot of additional logic in terms
of what data gets sent to Solr, so they just aren't going to
be of much use for us. It was easier for me to just write my own tool that
essentially does the same kind of thing, but with my application logic
built in.

Mary Jo


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-07 Thread Joe Lawson
MaryJo you might want to start a new thread, I think we kinda hijacked this
one. Also if you are interested in tuning queries check out
http://splainer.io/ and https://www.quepid.com which are interactive tools
(both of which my company makes) to tune for search relevancy.

On Tue, Jun 7, 2016 at 1:45 PM, MaryJo Sminkey  wrote:

> I'm really thinking this just might not be the right tool for us, what we
> really need is a solution that works like the normal synonym filter does,
> just with proper multi-term support, so I can apply the synonyms only on
> certain fields (copied fields) that have their own, lower boost settings.
> The way this plugin works across the entire query just seems too
> problematic when you need to do complex queries with lots of different
> boost settings to get good relevancy. Anyone used a different method of
> handling multi-term synonyms that isn't as global?
>
> Mary Jo
>
>
>
> On Tue, Jun 7, 2016 at 1:31 PM, MaryJo Sminkey 
> wrote:
>
> > Here's the issue I am still having with getting the right search
> relevancy
> > with the synonym plugin in place. We typically have users searching on
> > multiple terms, and we want matches across multiple terms, particularly
> > those that appear as phrases, to appear higher than matches for the same
> > term multiple times. The synonym filter makes this complicated since we
> may
> > have cases where the term the user enters, like "sbc", maps to a
> multi-term
> > synonym like "small block", and we always want the matches for the
> original
> > term to pop up first, so I'm trying to make sure the original boost is
> high
> > enough to override a phrase boost that the multi-term synonym would give.
> > Unfortunately this then means matches on the same term multiple times get
> > pushed up over my phrase matches...those aren't going to be the most
> > relevant matches. Not sure there's a way to solve this successfully,
> > without a completely different approach to the synonyms... or not
> counting
> > the number of matches on terms (I assume you can drop that ability,
> > although that's not ideal either...just better than what I have now).
> >
> > MJ
> >
> >
> >
> >
> > On Mon, Jun 6, 2016 at 9:39 PM, MaryJo Sminkey 
> > wrote:
> >
> >>
> >> On Mon, Jun 6, 2016 at 7:36 PM, Joe Lawson <
> >> jlaw...@opensourceconnections.com> wrote:
> >>
> >>>
> >>> We were thinking, as you experimented with, that the 0.5 and 2.0 boosts
> >>> were no match for the product name and keyword field boosts so that
> would
> >>> influence your search as well.
> >>
> >>
> >>
> >> Yeah I definitely will have to play with the values a bit as we want the
> >> product name matches to always appear highest, whether original or
> >> synonyms, but I'll have to figure out how to get that result without one
> >> word terms that have multi word synonyms getting overly boosted for a
> >> phrase match while still sufficiently boosting the normal phrase
> match
> >> stuff too. With the normal synonym filter I was able to just copy fields
> >> that could have synonyms to a new field (which would be the only one
> with
> >> the synonym filter), and use a different, lower boost on those fields,
> but
> >> that won't work with this plugin which applies across everything in the
> >> query. Makes it a bit more complicated to get everything just right.
> >>
> >> MJ
> >>
> >>
> >>
> >
> >
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-07 Thread MaryJo Sminkey
I'm really thinking this just might not be the right tool for us, what we
really need is a solution that works like the normal synonym filter does,
just with proper multi-term support, so I can apply the synonyms only on
certain fields (copied fields) that have their own, lower boost settings.
The way this plugin works across the entire query just seems too
problematic when you need to do complex queries with lots of different
boost settings to get good relevancy. Anyone used a different method of
handling multi-term synonyms that isn't as global?
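
For reference, the copy-field arrangement I mean is roughly this (field and
type names here are made up), with the synonym filter only in the analyzer
of the synonym type and a lower boost in qf:

<field name="name" type="text_general" indexed="true" stored="true"/>
<field name="name_syn" type="text_synonyms" indexed="true" stored="false"/>
<copyField source="name" dest="name_syn"/>
<!-- then query with e.g. qf="name^10 name_syn^2" -->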

Mary Jo



On Tue, Jun 7, 2016 at 1:31 PM, MaryJo Sminkey  wrote:

> Here's the issue I am still having with getting the right search relevancy
> with the synonym plugin in place. We typically have users searching on
> multiple terms, and we want matches across multiple terms, particularly
> those that appear as phrases, to appear higher than matches for the same
> term multiple times. The synonym filter makes this complicated since we may
> have cases where the term the user enters, like "sbc", maps to a multi-term
> synonym like "small block", and we always want the matches for the original
> term to pop up first, so I'm trying to make sure the original boost is high
> enough to override a phrase boost that the multi-term synonym would give.
> Unfortunately this then means matches on the same term multiple times get
> pushed up over my phrase matches...those aren't going to be the most
> relevant matches. Not sure there's a way to solve this successfully,
> without a completely different approach to the synonyms... or not counting
> the number of matches on terms (I assume you can drop that ability,
> although that's not ideal either...just better than what I have now).
>
> MJ
>
>
>
>
> On Mon, Jun 6, 2016 at 9:39 PM, MaryJo Sminkey 
> wrote:
>
>>
>> On Mon, Jun 6, 2016 at 7:36 PM, Joe Lawson <
>> jlaw...@opensourceconnections.com> wrote:
>>
>>>
>>> We were thinking, as you experimented with, that the 0.5 and 2.0 boosts
>>> were no match for the product name and keyword field boosts so that would
>>> influence your search as well.
>>
>>
>>
>> Yeah I definitely will have to play with the values a bit as we want the
>> product name matches to always appear highest, whether original or
>> synonyms, but I'll have to figure out how to get that result without one
>> word terms that have multi word synonyms getting overly boosted for a
>> phrase match while still sufficiently boosting the normal phrase match
>> stuff too. With the normal synonym filter I was able to just copy fields
>> that could have synonyms to a new field (which would be the only one with
>> the synonym filter), and use a different, lower boost on those fields, but
>> that won't work with this plugin which applies across everything in the
>> query. Makes it a bit more complicated to get everything just right.
>>
>> MJ
>>
>>
>>
>
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-07 Thread MaryJo Sminkey
Here's the issue I am still having with getting the right search relevancy
with the synonym plugin in place. We typically have users searching on
multiple terms, and we want matches across multiple terms, particularly
those that appear as phrases, to appear higher than matches for the same
term multiple times. The synonym filter makes this complicated since we may
have cases where the term the user enters, like "sbc", maps to a multi-term
synonym like "small block", and we always want the matches for the original
term to pop up first, so I'm trying to make sure the original boost is high
enough to override a phrase boost that the multi-term synonym would give.
Unfortunately this then means matches on the same term multiple times get
pushed up over my phrase matches...those aren't going to be the most
relevant matches. Not sure there's a way to solve this successfully,
without a completely different approach to the synonyms... or not counting
the number of matches on terms (I assume you can drop that ability,
although that's not ideal either...just better than what I have now).

MJ




On Mon, Jun 6, 2016 at 9:39 PM, MaryJo Sminkey  wrote:

>
> On Mon, Jun 6, 2016 at 7:36 PM, Joe Lawson <
> jlaw...@opensourceconnections.com> wrote:
>
>>
>> We were thinking, as you experimented with, that the 0.5 and 2.0 boosts
>> were no match for the product name and keyword field boosts so that would
>> influence your search as well.
>
>
>
> Yeah I definitely will have to play with the values a bit as we want the
> product name matches to always appear highest, whether original or
> synonyms, but I'll have to figure out how to get that result without one
> word terms that have multi word synonyms getting overly boosted for a
> phrase match while still sufficiently boosting the normal phrase match
> stuff too. With the normal synonym filter I was able to just copy fields
> that could have synonyms to a new field (which would be the only one with
> the synonym filter), and use a different, lower boost on those fields, but
> that won't work with this plugin which applies across everything in the
> query. Makes it a bit more complicated to get everything just right.
>
> MJ
>
>
>


Re: Need Help with Solr 6.0 Cross Data Center Replication

2016-06-07 Thread Satvinder Singh
Hi,

Any updates on this??

Thanks

Satvinder Singh
Security Systems Engineer
satvinder.si...@nc4.com
804.744.9630 x273 direct
703.989.8030 cell
www.NC4.com

On 5/19/16, 8:41 AM, "Satvinder Singh"  wrote:

>Hi,
>
>So this is what I did:
>
>I created Solr as a service. Below are the steps I followed for that:
>
>$ tar xzf solr-X.Y.Z.tgz solr-X.Y.Z/bin/install_solr_service.sh 
>--strip-components=2
>
>$ sudo bash ./install_solr_service.sh solr-X.Y.Z.tgz -i /opt/solr1 -d 
>/var/solr1 -u solr -s solr1 -p 8501
>$ sudo bash ./install_solr_service.sh solr-X.Y.Z.tgz -i /opt/solr2 -d 
>/var/solr2 -u solr -s solr2 -p 8502
>
>Then to start it in cloud I modified the solr1.cmd.in and solr2.cmd.in in 
>/etc/defaults/
>I added ZK_HOST=192.168.56.103:2181,192.168.56.103:2182,192.168.56.103:2183 
>(192.168.56.103 is where my 3 zookeeper instances are)
>
>Then I started the 2 solr services solr1 and solr2
>
>Then I created the configset
>/bin/solr zk -upconfig -z 
>192.168.56.103:2181,192.168.56.103:2182,192.168.56.103:2183 -n Liferay -d 
>server/solr/configsets/sample_techproducts_configs/conf
>
>Then I created the collection using:
>http://192.168.56.101:8501/solr/admin/collections?action=CREATE&name=dingdong&numShards=1&replicationFactor=2&collection.configName=liferay
>This created fine
>
>Then I deleted the solrconfig.xml from the zookeeper Liferay configset
>
>Then I uploaded the new solrconfig.xml to the configset. 
>
>Then when I do a reload on the collections I get the error, or if I create a new 
>collection I get the error.
>
>Thanks
>
>Satvinder Singh
>Security Systems Engineer
>satvinder.si...@nc4.com
>703.682.6000 x276 direct
>703.989.8030 cell
>www.NC4.com
>
>
>-----Original Message-----
>From: Renaud Delbru [mailto:renaud@siren.solutions] 
>Sent: Thursday, May 19, 2016 7:13 AM
>To: solr-user@lucene.apache.org
>Subject: Re: Need Help with Solr 6.0 Cross Data Center Replication
>
>I have reproduced your steps and the cdcr request handler started 
>successfully. I have attached to this mail the config sets I have used. 
>It is simply the sample_techproducts_config configset with your solrconfig.xml.
>
>I have used solr 6.0.0 with the following commands:
>
>$ ./bin/solr start -cloud
>
>$ ./bin/solr create_collection -c test_cdcr -d cdcr_configs
>
>Connecting to ZooKeeper at localhost:9983 ...
>Uploading /solr-6.0.0/server/solr/configsets/cdcr_configs/conf for config 
>test_cdcr to ZooKeeper at localhost:9983
>
>Creating new collection 'test_cdcr' using command:
>http://localhost:8983/solr/admin/collections?action=CREATE&name=test_cdcr&numShards=1&replicationFactor=1&maxShardsPerNode=1&collection.configName=test_cdcr
>
>{
>   "responseHeader":{
> "status":0,
> "QTime":5765},
>   "success":{"127.0.1.1:8983_solr":{
>   "responseHeader":{
> "status":0,
> "QTime":4426},
>   "core":"test_cdcr_shard1_replica1"}}}
>
>$ curl http://localhost:8983/solr/test_cdcr/cdcr?action=STATUS
>
><response>
>  <lst name="responseHeader">
>    <int name="status">0</int>
>    <int name="QTime">3</int>
>  </lst>
>  <lst name="status">
>    <str name="process">stopped</str>
>    <str name="buffer">enabled</str>
>  </lst>
></response>
>
>
>The difference is that I have used the embedded zookeeper, not a separate 
>ensemble.
>
>Could you please provide the commands you used to create the collection ?
>
>Kind Regards
>--
>Renaud Delbru
>
>On 16/05/16 16:55, Satvinder Singh wrote:
>> I also am using a zk ensemble with 3 nodes on each side.
>>
>> Thanks
>>
>>
>> Satvinder Singh
>>
>>
>>
>> Security Systems Engineer
>> satvinder.si...@nc4.com
>> 703.682.6000 x276 direct
>> 703.989.8030 cell
>> www.NC4.com
>>
>>
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: Satvinder Singh [mailto:satvinder.si...@nc4.com]
>> Sent: Monday, May 16, 2016 11:54 AM
>> To: solr-user@lucene.apache.org
>> Subject: RE: Need Help with Solr 6.0 Cross Data Center Replication
>>
>> Hi,
>>
>> So the way I am doing it is: for both the Target and Source side, I took 
>> a copy of the sample_techproducts_configs configset and created one 
>> configset. Then I modified the solrconfig.xml in there, both for the Target 
>> and Source side, and then created the collection, and I get the errors. I 
>> get the error if I create a new collection or try to reload an existing 
>> collection after the solrconfig update.
>> Attached is the log and configs.
>> Thanks
>>
>> Satvinder Singh
>>
>>
>>
>> Security Systems Engineer
>> satvinder.si...@nc4.com
>> 703.682.6000 x276 direct
>> 703.989.8030 cell
>> www.NC4.com
>>
>>
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: Renaud Delbru [mailto:renaud@siren.solutions]
>> Sent: Monday, May 16, 2016 11:45 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Need Help with Solr 6.0 Cross Data Center Replication
>>
>> Hi,
>>
>> I have tried to reproduce the problem, but was unable to.
>> I have downloaded the Solr 6.0 distribution, added to the solr config the 
>> cdcr request handler and 

Re: SolrCloud SolrNode stopping randomly for no reason

2016-06-07 Thread Pablo Anzorena
Sorry for the poor details, but I didn't post the log files because there
was nothing out of the ordinary in the solr.log file, nor in
the solr-8984-console.log, nor in solr_gc.log.

What log do you want me to show you? solr.log.1 (which I think should be
the most recent one) for example? Do you need the tail or the head of the file?

When I say "stopping for no reason" I mean the service is not running
anymore; the process is finished. I tried killing it with the kill -9 command
and it does not log that. My first thought was that I had restarted the
standalone Solr service, which tries to stop the service and, if it can't,
kills it by doing SOLR_PROCESS_ID=$(ps -eaf | grep -v "grep" | grep
"start.jar" | awk '{print $2}'); kill -9 ${SOLR_PROCESS_ID}. So sometimes it
could kill SolrCloud instead of the standalone instance, but sometimes the
datetime does not match. Another option is that it hit an OutOfMemoryError
and the OOM script is killing the process, but again I saw nothing in the
solr_gc.log.

2016-06-07 10:18 GMT-03:00 Shawn Heisey :

> On 6/7/2016 6:08 AM, Pablo Anzorena wrote:
> > I'm using SolrCloud with two nodes (5.2.1). One or two times a day
> > node1 is stopping for no reason. I checked the logs but no errors are
> > being logged.
> > I also have a standalone solr service in both nodes running in production
> > (we are doing the migration to SolrCloud, that's why).
>
> https://wiki.apache.org/solr/UsingMailingLists
>
> There are no real details to your message.  What precisely does
> "stopping for no reason" mean?  What does Solr *do*?  We cannot see your
> system, you must tell us what is happening with considerable detail.
>
> It seems highly unlikely that Solr would misbehave without logging
> *something*.  Are you looking at the Logging tab in the admin UI, or the
> actual solr.log file?  The solr.log file is the only reliable place to
> look.  When you restart Solr, the current logfile is renamed and a new
> solr.log will be created.
>
> Thanks,
> Shawn
>
>


Re: find stores with sales of > $x in last 2 months ?

2016-06-07 Thread Erick Erickson
Have you looked at the stats component here?
https://cwiki.apache.org/confluence/display/solr/The+Stats+Component

As for the rolling time period, date math is your friend:
https://cwiki.apache.org/confluence/display/solr/Working+with+Dates

If working with date math in fq clauses, beware of using a bare NOW,
see:
http://searchhub.org/2012/02/23/date-math-now-and-filter-queries/
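
Putting those together, a request along these lines (the field names
sale_date, sales and store_id are invented; URL-encode the spaces in a real
request) gives per-store sums over a rolling 60-day window:

q=*:*&fq=sale_date:[NOW/DAY-60DAYS TO NOW/DAY+1DAY]&stats=true&stats.field=sales&stats.facet=store_id&rows=0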

Best,
Erick

On Mon, Jun 6, 2016 at 5:45 AM, Allison, Timothy B.  wrote:
> Thank you, Alex.
>
>> Sorry, your question is a bit confusing.
> Y. Sorry.
>
>> Also, is this last month as in 'January' (rolling monthly) or as in 'last 30 
>> days'
> (rolling daily).
>
> Ideally, the latter, if this is possible to calculate dynamically in response 
> to a query.  My backoff method (if the 'rolling daily' method isn't 
> possible) would be to index monthly stats and then just use the range query 
> as you suggested.
>
> -----Original Message-----
> From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
> Sent: Sunday, June 5, 2016 12:52 AM
> To: solr-user 
> Subject: Re: find stores with sales of > $x in last 2 months ?
>
> Are you asking for just a numerical comparison during search, or about a way to 
> aggregate numbers from multiple records? Also, is this last month as in 
> 'January' (rolling monthly) or as in 'last 30 days'
> (rolling daily)? Sorry, your question is a bit confusing.
>
> Numerical comparison is just a range (numField:[x TO *])  as per
>
> https://cwiki.apache.org/confluence/display/solr/The+Standard+Query+Parser#TheStandardQueryParser-RangeSearches
>
> https://cwiki.apache.org/confluence/display/solr/The+Standard+Query+Parser#TheStandardQueryParser-DifferencesbetweenLuceneQueryParserandtheSolrStandardQueryParser
>
> Regards,
>Alex.
> 
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
>
> On 3 June 2016 at 23:23, Allison, Timothy B.  wrote:
>> All,
>>   This is a toy example, but is there a way to search for, say, stores with 
>> sales of > $x in the last 2 months with Solr?
>>   $x and the time frame are selected by the user at query time.
>>
>> If the queries could be constrained (this is still tbd), I could see 
>> updating "stats" fields within each store document on a daily basis 
>> (sales_last_1_month, sales_last_2_months, sales_last_3_months...etc).  The 
>> dataset is fairly small and daily updates of this nature would not be 
>> prohibitive.
>>
>>Or, is this trying to use a screwdriver where a hammer is required?
>>
>>Thank you.
>>
>>Best,
>>
>>  Tim


Re: Streaming expressions malfunctioning

2016-06-07 Thread Joel Bernstein
I'm not a big user of curl when I test the HTTP interface. Most of the time
I use a browser and just send the request and view the results. This won't
be feasible if you're streaming a large number of documents, but it works
well for quickly prototyping expressions. I believe Chrome will also handle
the URL encoding of the expression, but I usually URL-encode the expression
before sending it down.
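
For reference, a curl invocation that handles the encoding for you looks
roughly like this (reusing the collection and zkHost from the earlier
example):

curl --data-urlencode 'expr=search(collection1,zkhost="localhost:9983",qt="/export",fl="id",sort="id asc",q="*:*")' "http://localhost:8983/solr/collection1/stream"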


In Solr 6.1 the admin screen has a console for sending expressions which is
really nice. Also I believe Solr 6.1 does a better job of bubbling up the
error message all the way to the client.

In my testing all the expressions work from the http interface. But as they
get more complex it's easier to introduce errors into the syntax. If the
error that is returned to the client isn't clear enough you can check the
logs for the full stack trace.



Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, Jun 7, 2016 at 10:38 AM, jpereira  wrote:

> EDIT: I'll keep testing with other stream sources/decorators. So far only
> the search endpoint works in both the Java and cURL implementations.
>
> Cheers
>
>
>
>


Re: Cloud Solr 5.3.1 + 6.0.1 cannot delete documents

2016-06-07 Thread Erick Erickson
OK, let's see the code you're using, including how you open your solrClient,
how you commit, and how you show that the deleted doc is still there. This
should be translatable into a test case. Like I said, this is tested in the unit
tests, so it would be good to see the difference between the test case
and what you're doing.
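
For comparison, a minimal shape for such a test (the ZK address is a
placeholder; the collection and id are taken from your earlier mails) would
be:

import java.io.IOException;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

class DeleteByIdTest {
  public static void main(String[] args) throws SolrServerException, IOException {
    try (CloudSolrClient client = new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181")) {
      client.setDefaultCollection("cc5363_dm_documentversion");
      client.deleteById("12535");          // routed to the shard that owns the id
      // client.deleteByQuery("id:12535"); // broadcast to every shard instead
      client.commit();
    }
  }
}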

Best,
Erick

On Sun, Jun 5, 2016 at 1:47 PM, Moritz Becker  wrote:
> I just checked the shards again (with distrib=false) and it seems that
> I was mistaken, the document does *not* reside in _different_ shards -
> everything good in this respect.
>
> However, I still have the issue that deleteById does not work whereas
> deleteByQuery works. Specifically, the following line does *not* work:
>
> UpdateResponse response = solrClient.deleteById(collection, );
>
> And the following line works:
>
> UpdateResponse response = solrClient.deleteByQuery(collection, "id:" +
> );
>
> I do not touch/change any other code when switching between these two
> modes and in both scenarios I use CloudSolrClient.
>
> Am 31.05.2016 um 05:32 schrieb Erick Erickson:
>> bq: I checked in the Solr Admin and noticed that the same document
>> resided in both shards on the same node
>>
>> If this means two _different_ shards (as opposed to two replicas in
>> the _same_ shard) showed the
>> document, then that's the proverbial "smoking gun", somehow your setup
>> isn't what you think
>> it is, perhaps you are somehow using implicit routing and routing the
>> doc with the same ID to
>> two different shards?
>>
>> try querying each of your replicas with distrib=false to see if the
>> doc is somehow on two different
>> shards. If so, I suspect that's the root of your problems and figuring
>> out _how_ that happened
>> is the next step I'd recommend.
>>
>> As to why the raw URL deletes should work and CloudSolrClient doesn't,
>> CloudSolrClient
>> tries to send updates only to the shard that they should end up on. So
>> if your routing is
>> odd or you somehow have the same doc on two shards, the "wrong" shard 
>> wouldn't
>> see the delete. There's some speculation here BTW, I didn't trace
>> through the code...
>>
>> But this functionality is tested in the unit tests
>> (CloudSolrClientTest.java), so I suspect it's
>> something odd in your setup
>>
>> Best,
>> Erick
>>
>> On Mon, May 30, 2016 at 12:33 PM, Moritz Becker  wrote:
>>> Hi,
>>>
>>> I have the following issue:
>>> I initially started with a Solr 5.3.1 + Zookeeper 3.4.6 cloud setup with 2 
>>> solr nodes and with one collection consisting of 2 shards and 2 replicas.
>>>
>>> I am accessing the cluster using the CloudSolrClient. When I tried to 
>>> delete a document, no error occurred but after deletion and subsequent 
>>> commit, the document was still available via index queries.
>>> I checked in the Solr Admin and noticed that the same document resided in 
>>> both shards on the same node which I thought was odd.
>>> Also after deleting the collection and recreating it, the issue remained.
>>>
>>> Then I tried upgrading to latest Solr 6.0.1 with the same setup. Again, I 
>>> recreated the collection but I still could not delete the documents. Here 
>>> is a log snippet of the deletion attempt of a single document:
>>>
>>> 
>>>
>>> 126023 INFO  (qtp12209492-16) [c:cc5363_dm_documentversion s:shard1 
>>> r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
>>> o.a.s.u.p.LogUpdateProcessorFactory 
>>> [cc5363_dm_documentversion_shard1_replica1]  webapp=/solr path=/update 
>>> params={update.distrib=FROMLEADER=http://localhost:8983/solr/cc5363_dm_documentversion_shard1_replica2/=javabin=2}{delete=[12535
>>>  (-1535773473331216384)]} 0 16
>>> 126024 INFO  (commitScheduler-15-thread-1) [c:cc5363_dm_documentversion 
>>> s:shard1 r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
>>> o.a.s.u.DirectUpdateHandler2 start 
>>> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
>>> 126036 INFO  (commitScheduler-15-thread-1) [c:cc5363_dm_documentversion 
>>> s:shard1 r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
>>> o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: 
>>> org.apache.solr.search.SolrIndexSearcher
>>> 126038 INFO  (commitScheduler-15-thread-1) [c:cc5363_dm_documentversion 
>>> s:shard1 r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
>>> o.a.s.u.DirectUpdateHandler2 end_commit_flush
>>> 126049 INFO  (qtp12209492-20) [c:cc5363_dm_documentversion s:shard2 
>>> r:core_node1 x:cc5363_dm_documentversion_shard2_replica1] 
>>> o.a.s.u.DirectUpdateHandler2 start 
>>> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
>>> 126050 INFO  (qtp12209492-20) [c:cc5363_dm_documentversion s:shard2 
>>> r:core_node1 x:cc5363_dm_documentversion_shard2_replica1] 
>>> o.a.s.u.DirectUpdateHandler2 No 

Re: Streaming expressions malfunctioning

2016-06-07 Thread jpereira
EDIT: I'll keep testing with other stream sources/decorators. So far only the
search endpoint works in both the Java and cURL implementations.

Cheers





Streaming expressions malfunctioning

2016-06-07 Thread jpereira
Hi there,

I just recently upgraded a Solr instance to version 6.0.1, and while
trying out the new streaming expressions feature I discovered what I think
might be a bug in the HTTP interface.

I tried to create two simple streaming expressions, as described below:

innerJoin(
search(collection1,zkhost="localhost:9983",qt="/export",fl="id",sort="id
asc",q="*:*"),
search(collection2,zkhost="localhost:9983",qt="/export",fl="id",sort="id
asc",q="*:*"),
on("id")
)

facet(
collection3,
q="*:*",
buckets="field1",
bucketSorts="sum(field2) asc",
sum(field2),
count(*)
)

What I noticed is that while I can obtain the expression results using Java,
the feature does not seem to function when I try to get the same data via
cURL. You can see the code snippets I used below. Can you tell me if I am
doing anything wrong or if this is indeed a bug in this version?

Inner Join expression

HTTP/PHP (not working)

Request:



JAVA (working)



Facet Expression

HTTP/PHP

Request:


JAVA



Thanks for the help!

Regards,

João Pereira





Re: Using Solr to index zip files

2016-06-07 Thread Alexandre Rafalovitch
I _think_ DataImportHandler could handle zip files with a fixed level of
nesting, but it cannot read from HDFS.

I don't think anything else in Solr will. So doing it outside of Solr
is probably best, especially since you would need to decide how you
actually want to map these files (e.g. whether you keep the path for a zip
within a zip, etc).
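
If you do it outside Solr, a plain java.util.zip walk handles arbitrary
nesting; a sketch (reading from HDFS and the actual posting to Solr are left
out, and the class name is made up):

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class NestedZipWalker {
  // Recursively walks a zip stream, descending into nested .zip entries.
  static void walk(InputStream in, String path) throws IOException {
    ZipInputStream zin = new ZipInputStream(in); // don't close: it would close the outer stream
    ZipEntry e;
    while ((e = zin.getNextEntry()) != null) {
      String entryPath = path + "!/" + e.getName();
      if (e.getName().toLowerCase().endsWith(".zip")) {
        walk(zin, entryPath);                    // recurse into the nested zip
      } else if (!e.isDirectory()) {
        System.out.println("would index: " + entryPath); // hand bytes to your indexer here
      }
    }
  }

  public static void main(String[] args) throws IOException {
    try (InputStream in = new FileInputStream(args[0])) {
      walk(in, args[0]);
    }
  }
}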

Regards,
Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 7 June 2016 at 12:57,   wrote:
> Hi,
>
> I have a use case where I need to search zip files quickly in HDFS. I intend 
> to use Solr but am not finding any relevant information about whether it can be 
> done for zip files.
> These are nested zip files, i.e. zips within a zip file. Any help/information 
> is much appreciated.
>
> Thank you,
> Regards,
> Anupama
>
>
>


RE: Using Solr to index zip files

2016-06-07 Thread BURN, James
Hi
I think you'll need to do some unzipping of your zip files using an unzip 
application before you post to Solr. If you do this via an OS-level batch script 
you can apply logic there to deal with nested zips. Then post your unzipped 
files to Solr via curl.

James

-----Original Message-----
From: anupama.gangad...@daimler.com [mailto:anupama.gangad...@daimler.com] 
Sent: 07 June 2016 03:57
To: solr-user@lucene.apache.org
Subject: Using Solr to index zip files

Hi,

I have a use case where I need to search zip files quickly in HDFS. I intend 
to use Solr but am not finding any relevant information about whether it can be 
done for zip files.
These are nested zip files, i.e. zips within a zip file. Any help/information is 
much appreciated.

Thank you,
Regards,
Anupama





Re: Solr 5.4 Transaction

2016-06-07 Thread Shawn Heisey
On 6/7/2016 12:41 AM, Pithon Philippe wrote:
> I have a question about Solr transactions compared to relational databases.
> The Solr commit is not isolated for each client session, right? In my test
> (source below) the commit in one session also adds records from other
> sessions. Is there documentation on this? Are improvements planned
> (version 6, version 7)?

No improvements are planned.

You've gotten a few replies telling you that if you want real
transactions in Solr, you'll have to implement them.  This is because
Lucene (the API that provides most of Solr's functionality) does not do
transactions like a relational database does.  That's the important
thing to remember here.  Lucene is a *search* API, not a *database*
API.  Solr is a search server, not a database server.

Adding the kind of transactions you want would require fundamental (and
likely large-scale) changes to the low-level design, quite possibly at
the Lucene level rather than Solr.  There's a very good chance that
adding this would slow down the primary feature -- search.  It would
also require considerable effort -- both to initially write and to
maintain/debug.

Solr is advertised as NoSQL.  If they are configured and used correctly,
Lucene-based applications (like Solr) can fill a NoSQL role, but this is
not the GOAL of the software.  I get a little worried every time I see
the release note claiming NoSQL capability, because when they see that,
I know that users are going to request additional features that search
doesn't need, like transactions.

Something else that I saw in the replies you already got:  If you really
need transaction support, find existing software that's been doing it
for a lot of years, and use that software as a canonical data source to
populate your Solr index.

Thanks,
Shawn



Re: Solr 6.1.x Release Date ??

2016-06-07 Thread Mikhail Khludnev
It's not an answer to your question, but have you found it in a nightly
build?

On Tue, Jun 7, 2016 at 8:38 AM, Ramesh Shankar  wrote:

> Hi,
>
> Any idea of Solr 6.1.X Release Date ??
>
> I am interested in the [subquery] transformer and would like to know the release
> date since it's available only in 6.1.x
>
> Thanks & Regards
> Ramesh
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics





Re: Solr 5.4 Transaction

2016-06-07 Thread Mikhail Khludnev
Ah, it's easy. It reminds me of the fact that ESBs don't use transactions,
favoring compensations instead.
Allocate a txId and assign it to every doc; then, if you need to roll back,
submit deleteByQuery txId:<666> and commit. Although it allows both phantom
and dirty reads, it's really cheap. If you need something truly atomic,
evaluate the addIndexes approach.
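
A tiny SolrJ illustration of the idea (the "txId_s" field name and the
client are placeholders):

import java.io.IOException;
import java.util.UUID;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.common.SolrInputDocument;

class CompensationExample {
  static void addBatchOrCompensate(SolrClient client, Iterable<SolrInputDocument> docs)
      throws SolrServerException, IOException {
    String txId = UUID.randomUUID().toString();
    try {
      for (SolrInputDocument doc : docs) {
        doc.addField("txId_s", txId);   // tag every doc in the logical transaction
        client.add(doc);
      }
      client.commit();
    } catch (Exception e) {
      client.deleteByQuery("txId_s:" + txId); // compensate: drop the partial batch
      client.commit();
    }
  }
}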

On Tue, Jun 7, 2016 at 12:14 PM, Pithon Philippe 
wrote:

> Thanks,
> my problem is rolling back a transaction per user...
> if there is a problem in a try/catch for one user, the rollback applies to
> all users' uncommitted changes...
> Philippe
>
> 2016-06-07 9:23 GMT+02:00 Mikhail Khludnev :
>
> > Hello,
> >
> > That's how Lucene works underneath.
> > Just an ad-hoc idea: you can have isolated per-session indices and then add
> > them all together. Thinking deeper, we could consider per-thread inverters
> > which are somehow isolated; perhaps something transactional might be built on
> > top of them, but that's a really deep hack.
> > Anyway, it's not clear what you want to achieve and why this way. Also,
> > this question suits dev@ much better.
> > On 7 June 2016 at 9:41, "Pithon Philippe" <
> > ppithon.si...@gmail.com>
> > wrote:
> >
> > > Hi,
> > > I have a question about Solr transactions compared to relational databases.
> > >
> > > The Solr commit is not isolated for each client session, right?
> > > In my test (source below) the commit in one session also adds records from
> > > other sessions.
> > > Is there documentation on this?
> > > Are improvements planned (version 6, version 7)?
> > > Thank you for any ideas!
> > >
> > >
> > > Source example  :
> > >
> > > public class TransactionTest {
> > >
> > > static final String BASE_URL = "http://localhost:8983/solr/test";
> > >
> > > public static void main(String[] args) {
> > >
> > > try {
> > > new TransactionTest();
> > > } catch (Exception e) {
> > > e.printStackTrace();
> > > }
> > >
> > > }
> > >
> > > public TransactionTest() throws Exception {
> > >
> > > HttpSolrClient solrClient = new HttpSolrClient(BASE_URL);
> > >
> > > DTOMail mail = new DTOMail();
> > > mail.setType("mail");
> > > mail.setBody("test body");
> > >
> > > System.out.println("add been");
> > > solrClient.addBean(mail);
> > >
> > > pause();
> > >
> > > System.out.println("commit");
> > > solrClient.commit();
> > >
> > > solrClient.close();
> > > }
> > >
> > > private void pause() {
> > >
> > > try {
> > > System.in.read();
> > > } catch (Exception e) {
> > > }
> > >
> > > }
> > >
> > > }
> > >
> >
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics





Re: SolrCloud SolrNode stopping randomly for no reason

2016-06-07 Thread Shawn Heisey
On 6/7/2016 6:08 AM, Pablo Anzorena wrote:
> I'm using SolrCloud with two nodes (5.2.1). One or two times a day
> node1 is stopping for no reason. I checked the logs but no errors are being
> logged.
> I also have a standalone solr service in both nodes running in production
> (we are doing the migration to SolrCloud, that's why).

https://wiki.apache.org/solr/UsingMailingLists

There are no real details to your message.  What precisely does
"stopping for no reason" mean?  What does Solr *do*?  We cannot see your
system, you must tell us what is happening with considerable detail.

It seems highly unlikely that Solr would misbehave without logging
*something*.  Are you looking at the Logging tab in the admin UI, or the
actual solr.log file?  The solr.log file is the only reliable place to
look.  When you restart Solr, the current logfile is renamed and a new
solr.log will be created.

Thanks,
Shawn



SolrCloud SolrNode stopping randomly for no reason

2016-06-07 Thread Pablo Anzorena
Hey,

I'm using SolrCloud with two nodes (5.2.1). One or two times a day
node1 is stopping for no reason. I checked the logs but no errors are being
logged.
I also have a standalone solr service in both nodes running in production
(we are doing the migration to SolrCloud, that's why).

Thanks.


Re: Solr 5.4 Transaction

2016-06-07 Thread Upayavira
There is no equivalent to transactions in Lucene or Solr. A commit
persists newly added documents to disk - all of them, regardless of
which client sent them. A rollback discards all uncommitted documents,
regardless of which client sent them, which renders a rollback pretty
much useless.

If you need a transaction system, you're gonna need to manage it
yourself. You could, before updating some documents, retrieve and store
them. Then, should the transaction fail, you post your old documents
over the top of your changes. You might be able to use some of the
features of partial updates (e.g. the _version_ field) to ensure that
you don't accidentally overwrite someone else's changes.

But basically, if you want a transaction system, you're gonna have to
build it yourself. (Or, push the data to a database that *does* support
transactions, and then have a process that pulls updates from the DB to
Solr).

Upayavira

On Tue, 7 Jun 2016, at 10:52 AM, Vincenzo D'Amore wrote:
> Hi Pithon,
> 
> I have to state beforehand that I worked with transactions on Solr 4.8.1,
> so I'm not sure whether transactions have changed in Solr over time.
> And, I have to add, transaction support is not very well documented,
> so most of what I know is based on my experience.
> 
> Given that, Solr clients connect via HTTP, which is a stateless protocol,
> so many clients can connect concurrently.
> That's an important point, and IMHO it makes a great difference from other
> products whose clients hold an "open connection", which enables them to
> have a transaction at the protocol level.
> 
> So this means that when a client commits (or rolls back), all active
> clients are affected.
> When a client rolls back, all the previous work done by all the clients
> will be rolled back.
> 
> That's why, if you really need to roll back, only one client should be
> connected at a time.
> 
> I think, if you really want to implement transactions with Solr, what you
> can do is add a layer, your own layer, that transparently provides this
> functionality.
> 
> Best regards,
> Vincenzo
> 
> 
> 
> On Tue, Jun 7, 2016 at 11:14 AM, Pithon Philippe
> 
> wrote:
> 
> > Thanks,
> > my problem is rolling back a transaction per user...
> > if there is a problem in a try/catch for one user, the rollback applies to
> > all users' uncommitted changes...
> > Philippe
> >
> > 2016-06-07 9:23 GMT+02:00 Mikhail Khludnev :
> >
> > > Hello,
> > >
> > > That's how Lucene works underneath.
> > > Just an ad-hoc idea: you can have isolated per-session indices and then add
> > > them all together. Thinking deeper, we could consider per-thread inverters
> > > which are somehow isolated; perhaps something transactional might be built on
> > > top of them, but that's a really deep hack.
> > > Anyway, it's not clear what you want to achieve and why this way. Also,
> > > this question suits dev@ much better.
> > > On 7 June 2016 at 9:41, "Pithon Philippe" <
> > > ppithon.si...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > > I have a question about Solr transactions compared to relational databases.
> > > >
> > > > The Solr commit is not isolated for each client session, right?
> > > > In my test (source below) the commit in one session also adds records from
> > > > other sessions.
> > > > Is there documentation on this?
> > > > Are improvements planned (version 6, version 7)?
> > > > Thank you for any ideas!
> > > >
> > > >
> > > > Source example  :
> > > >
> > > > public class TransactionTest {
> > > >
> > > > static final String BASE_URL = "http://localhost:8983/solr/test";
> > > >
> > > > public static void main(String[] args) {
> > > >
> > > > try {
> > > > new TransactionTest();
> > > > } catch (Exception e) {
> > > > e.printStackTrace();
> > > > }
> > > >
> > > > }
> > > >
> > > > public TransactionTest() throws Exception {
> > > >
> > > > HttpSolrClient solrClient = new HttpSolrClient(BASE_URL);
> > > >
> > > > DTOMail mail = new DTOMail();
> > > > mail.setType("mail");
> > > > mail.setBody("test body");
> > > >
> > > > System.out.println("add been");
> > > > solrClient.addBean(mail);
> > > >
> > > > pause();
> > > >
> > > > System.out.println("commit");
> > > > solrClient.commit();
> > > >
> > > > solrClient.close();
> > > > }
> > > >
> > > > private void pause() {
> > > >
> > > > try {
> > > > System.in.read();
> > > > } catch (Exception e) {
> > > > }
> > > >
> > > > }
> > > >
> > > > }
> > > >
> > >
> >
> 
> 
> 
> -- 
> Vincenzo D'Amore
> email: v.dam...@gmail.com
> skype: free.dev
> mobile: +39 349 8513251


solrj SpellCheck request fails

2016-06-07 Thread Rohana Rajapakse
The QueryResponse class's _spellResponse is not populated.  The following lines 
worked in Solr 4.x, but do not in Solr 6/7:

SolrClient server = new HttpSolrClient();
QueryResponse response =
    server.query(new SolrQuery("bus").setParam("spellcheck", "true"));  // this returns results
SpellCheckResponse scr = response.getSpellCheckResponse();
// returns null because _spellResponse in the QueryResponse class is not populated
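
For reference, a fuller 6.x-style attempt (the base URL, the core name, and
the /spell handler are assumptions about a local setup) -- getSpellCheckResponse()
appears to be populated only when the response actually contains a spellcheck
section, which typically means querying a handler that includes the spellcheck
component:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.SpellCheckResponse;

public class SpellCheckTest {
    public static void main(String[] args) throws Exception {
        // base URL and core name are illustrative
        HttpSolrClient client =
                new HttpSolrClient("http://localhost:8983/solr/mycore");
        SolrQuery query = new SolrQuery("bus");
        query.setRequestHandler("/spell");  // a handler that includes the spellcheck component
        query.set("spellcheck", "true");
        QueryResponse response = client.query(query);
        SpellCheckResponse scr = response.getSpellCheckResponse();
        if (scr != null) {
            scr.getSuggestions().forEach(s ->
                    System.out.println(s.getToken() + " -> " + s.getAlternatives()));
        }
        client.close();
    }
}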


Any ideas?

Rohana







Re: Solr 5.4 Transaction

2016-06-07 Thread Vincenzo D'Amore
Hi Pithon,

I should say up front that I worked with transactions on Solr 4.8.1,
so I'm not sure whether transactions have changed in Solr since then.
And, I have to add, transaction support is not very well documented,
so most of what I know is based on my experience.

Given that, Solr clients connect via HTTP, which is a stateless protocol,
so many clients can connect concurrently.
That's an important point, and IMHO it makes a great difference from other
products whose clients hold an "open connection", which enables them to
have a transaction at the protocol level.

So this means that when a client commits (or rolls back), all active
clients are involved.
When a client rolls back, all the previous work done by all the clients
will be rolled back.

That's why, if you really need to roll back, only one client should be
connected at a time.
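
To make that concrete, a minimal sketch (the core name is illustrative,
imports omitted) of how one client's rollback discards another client's
pending work:

// Two independent clients against the same core share one pending
// (uncommitted) update state, so a rollback from either discards both adds.
HttpSolrClient clientA = new HttpSolrClient("http://localhost:8983/solr/test");
HttpSolrClient clientB = new HttpSolrClient("http://localhost:8983/solr/test");

SolrInputDocument docA = new SolrInputDocument();
docA.addField("id", "a1");
clientA.add(docA);      // uncommitted add from client A

SolrInputDocument docB = new SolrInputDocument();
docB.addField("id", "b1");
clientB.add(docB);      // uncommitted add from client B

clientB.rollback();     // discards BOTH pending adds, not just B's
clientA.commit();       // nothing to commit: a1 was rolled back as well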

I think, if you really want to implement transactions with Solr, what you
can do is add a layer, your own layer, that transparently gives this
functionality, as sketched below.
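
One possible shape for such a layer -- a hedged sketch, not a definitive
implementation: buffer documents client-side and send them only at commit
time, so a rollback stays purely local (the class name is illustrative):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.common.SolrInputDocument;

class BufferedSolrSession {

    private final SolrClient client;
    private final List<SolrInputDocument> pending = new ArrayList<>();

    BufferedSolrSession(SolrClient client) {
        this.client = client;
    }

    // Buffer locally; nothing is sent to Solr yet.
    void add(SolrInputDocument doc) {
        pending.add(doc);
    }

    // Push the whole buffer as one batch, then commit.
    // Note: the commit itself is still global in Solr -- it also makes
    // other clients' pending updates visible.
    void commit() throws SolrServerException, IOException {
        if (!pending.isEmpty()) {
            client.add(pending);
        }
        client.commit();
        pending.clear();
    }

    // Purely local: other sessions' pending work is untouched.
    void rollback() {
        pending.clear();
    }
}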

Best regards,
Vincenzo



On Tue, Jun 7, 2016 at 11:14 AM, Pithon Philippe 
wrote:

> Thanks,
> my problem is rolling back a transaction per user...
> if there is a problem in the try/catch for one user, the rollback
> discards the pending updates of all users...
> Philippe
>
> 2016-06-07 9:23 GMT+02:00 Mikhail Khludnev :
>
> > Hello,
> >
> > That's how Lucene works underneath.
> > Just an ad-hoc idea: you could have isolated per-session indices and
> > then merge them together. Thinking deeper, we could consider per-thread
> > inverters which are somehow isolated; perhaps something transactional
> > might be built on top of them, but that's a really deep hack.
> > Anyway, it's not clear what you want to achieve this way, or why. Also,
> > this question suits dev@ much better.
> > On 7 June 2016 at 9:41, "Pithon Philippe" <
> > ppithon.si...@gmail.com>
> > wrote:
> >
> > > Hi,
> > > I have a question about Solr transactions compared to relational
> > > databases.
> > >
> > > The Solr commit is not isolated for each client session, right?
> > > In my test (source below), the commit in one session also commits the
> > > records of other sessions.
> > > Is there documentation on this?
> > > Are improvements planned for this in version 6 or version 7?
> > > Thank you for any ideas!
> > >
> > >
> > > Source example  :
> > >
> > > public class TransactionTest {
> > >
> > > static final String BASE_URL = "http://localhost:8983/solr/test";
> > >
> > > public static void main(String[] args) {
> > >
> > > try {
> > > new TransactionTest();
> > > } catch (Exception e) {
> > > e.printStackTrace();
> > > }
> > >
> > > }
> > >
> > > public TransactionTest() throws Exception {
> > >
> > > HttpSolrClient solrClient = new HttpSolrClient(BASE_URL);
> > >
> > > DTOMail mail = new DTOMail();
> > > mail.setType("mail");
> > > mail.setBody("test body");
> > >
> > > System.out.println("add been");
> > > solrClient.addBean(mail);
> > >
> > > pause();
> > >
> > > System.out.println("commit");
> > > solrClient.commit();
> > >
> > > solrClient.close();
> > > }
> > >
> > > private void pause() {
> > >
> > > try {
> > > System.in.read();
> > > } catch (Exception e) {
> > > }
> > >
> > > }
> > >
> > > }
> > >
> >
>



-- 
Vincenzo D'Amore
email: v.dam...@gmail.com
skype: free.dev
mobile: +39 349 8513251


Re: Solr 5.4 Transaction

2016-06-07 Thread Pithon Philippe
Thanks,
my problem is rolling back a transaction per user...
if there is a problem in the try/catch for one user, the rollback
discards the pending updates of all users...
Philippe

2016-06-07 9:23 GMT+02:00 Mikhail Khludnev :

> Hello,
>
> That's how Lucene works underneath.
> Just an ad-hoc idea: you could have isolated per-session indices and
> then merge them together. Thinking deeper, we could consider per-thread
> inverters which are somehow isolated; perhaps something transactional
> might be built on top of them, but that's a really deep hack.
> Anyway, it's not clear what you want to achieve this way, or why. Also,
> this question suits dev@ much better.
> On 7 June 2016 at 9:41, "Pithon Philippe" <
> ppithon.si...@gmail.com>
> wrote:
>
> > Hi,
> > I have a question about Solr transactions compared to relational
> > databases.
> >
> > The Solr commit is not isolated for each client session, right?
> > In my test (source below), the commit in one session also commits the
> > records of other sessions.
> > Is there documentation on this?
> > Are improvements planned for this in version 6 or version 7?
> > Thank you for any ideas!
> >
> >
> > Source example  :
> >
> > public class TransactionTest {
> >
> > static final String BASE_URL = "http://localhost:8983/solr/test";
> >
> > public static void main(String[] args) {
> >
> > try {
> > new TransactionTest();
> > } catch (Exception e) {
> > e.printStackTrace();
> > }
> >
> > }
> >
> > public TransactionTest() throws Exception {
> >
> > HttpSolrClient solrClient = new HttpSolrClient(BASE_URL);
> >
> > DTOMail mail = new DTOMail();
> > mail.setType("mail");
> > mail.setBody("test body");
> >
> > System.out.println("add been");
> > solrClient.addBean(mail);
> >
> > pause();
> >
> > System.out.println("commit");
> > solrClient.commit();
> >
> > solrClient.close();
> > }
> >
> > private void pause() {
> >
> > try {
> > System.in.read();
> > } catch (Exception e) {
> > }
> >
> > }
> >
> > }
> >
>


Re: Solr 6 CDCR does not work

2016-06-07 Thread Uwe Reh

Hi Adam,

maybe it's my poor English, but I'm confused.
I've taken Renaud's quote as a hint to activate autocommit on the
target cluster, or at least to issue frequent manual commits, so that
the replicated documents become visible.
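
(For context, activating autocommit on a target would mean something like
this solrconfig.xml fragment inside <updateHandler>; the intervals are
illustrative, not recommendations:)

<!-- autoCommit on the target cluster, since CDCR does not replicate
     commits; intervals are illustrative -->
<autoCommit>
  <maxTime>60000</maxTime>            <!-- hard commit every 60s -->
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>15000</maxTime>            <!-- soft commit makes docs visible -->
</autoSoftCommit>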

Now you wrote disabling autocommit helps.

Could you please clarify this point?

Regards
Uwe


On 01.06.2016 at 12:28, Adam Majid Sanjaya wrote:

disable autocommit on the target

It worked!
thanks

2016-05-30 15:40 GMT+07:00 Renaud Delbru :


Hi Adam,
...
Also, do you have an autocommit configured on the target ? CDCR does not
replicate commit, and therefore you have to send a commit command on the
target to ensure that the latest replicated documents are visible.
...
--
Renaud Delbru



Re: Solr 5.4 Transaction

2016-06-07 Thread Mikhail Khludnev
Hello,

That's how Lucene works underneath.
Just an ad-hoc idea: you could have isolated per-session indices and then
merge them together (a sketch follows below). Thinking deeper, we could
consider per-thread inverters which are somehow isolated; perhaps something
transactional might be built on top of them, but that's a really deep hack.
Anyway, it's not clear what you want to achieve this way, or why. Also,
this question suits dev@ much better.
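
A hedged Lucene-level sketch of the per-session idea (paths and field
names are illustrative; this is the raw primitive, not something Solr
exposes):

import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class PerSessionIndexTest {
    public static void main(String[] args) throws Exception {
        // 1. Each session writes to its own private index.
        Directory sessionDir = FSDirectory.open(Paths.get("/tmp/session-42"));
        IndexWriter sessionWriter =
                new IndexWriter(sessionDir, new IndexWriterConfig(new StandardAnalyzer()));
        Document doc = new Document();
        doc.add(new TextField("body", "test body", Field.Store.YES));
        sessionWriter.addDocument(doc);
        sessionWriter.close();  // the "session commit": the private index is complete

        // 2. On success, merge the whole session into the main index in one
        //    step; on failure, simply delete the session directory instead.
        Directory mainDir = FSDirectory.open(Paths.get("/tmp/main-index"));
        IndexWriter mainWriter =
                new IndexWriter(mainDir, new IndexWriterConfig(new StandardAnalyzer()));
        mainWriter.addIndexes(sessionDir);
        mainWriter.commit();
        mainWriter.close();
    }
}
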
On 7 June 2016 at 9:41, "Pithon Philippe"
wrote:

> Hi,
> I have a question about Solr transactions compared to relational
> databases.
>
> The Solr commit is not isolated for each client session, right?
> In my test (source below), the commit in one session also commits the
> records of other sessions.
> Is there documentation on this?
> Are improvements planned for this in version 6 or version 7?
> Thank you for any ideas!
>
>
> Source example  :
>
> public class TransactionTest {
>
> static final String BASE_URL = "http://localhost:8983/solr/test";
>
> public static void main(String[] args) {
>
> try {
> new TransactionTest();
> } catch (Exception e) {
> e.printStackTrace();
> }
>
> }
>
> public TransactionTest() throws Exception {
>
> HttpSolrClient solrClient = new HttpSolrClient(BASE_URL);
>
> DTOMail mail = new DTOMail();
> mail.setType("mail");
> mail.setBody("test body");
>
> System.out.println("add been");
> solrClient.addBean(mail);
>
> pause();
>
> System.out.println("commit");
> solrClient.commit();
>
> solrClient.close();
> }
>
> private void pause() {
>
> try {
> System.in.read();
> } catch (Exception e) {
> }
>
> }
>
> }
>


Solr 5.4 Transaction

2016-06-07 Thread Pithon Philippe
Hi,
I have a question about Solr transactions compared to relational databases.

The Solr commit is not isolated for each client session, right?
In my test (source below), the commit in one session also commits the
records of other sessions.
Is there documentation on this?
Are improvements planned for this in version 6 or version 7?
Thank you for any ideas!


Source example:

import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class TransactionTest {

    static final String BASE_URL = "http://localhost:8983/solr/test";

    public static void main(String[] args) {
        try {
            new TransactionTest();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public TransactionTest() throws Exception {
        HttpSolrClient solrClient = new HttpSolrClient(BASE_URL);

        // DTOMail is a simple bean whose fields carry SolrJ @Field annotations
        DTOMail mail = new DTOMail();
        mail.setType("mail");
        mail.setBody("test body");

        System.out.println("add bean");
        solrClient.addBean(mail);  // pending (uncommitted) add

        pause();  // wait for a keypress before committing

        System.out.println("commit");
        solrClient.commit();  // also makes other sessions' pending adds visible

        solrClient.close();
    }

    private void pause() {
        try {
            System.in.read();
        } catch (Exception e) {
            // ignore
        }
    }
}
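
If the behaviour described in this thread holds, running two copies of this
program side by side and pressing Enter in only one of them should make both
pending documents visible, since the single commit applies to every session's
uncommitted adds.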