Re: SOLR Cloud: 1500+ threads are in TIMED_WAITING status

2018-04-04 Thread Doss
@wunder

Are you sending updates in batches? Are you doing a commit after every
update? 

>> We want the system to be near real time, so we are not doing updates in
>> batches, and we are also not doing a commit after every update.
>> autoSoftCommit runs once every minute, and autoCommit once every 10
>> minutes.

This thread increase does not happen all the time. During our peak hours,
when we get more user interactions, the system works absolutely fine; then
suddenly this problem creeps up and the system gets into trouble.

The nproc value has been increased to 18000.
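
For reference, a per-user limit like this is usually raised in
/etc/security/limits.conf; a sketch, assuming Solr runs as a user named
"solr":

    # /etc/security/limits.conf -- raise max user processes/threads
    solr  soft  nproc  18000
    solr  hard  nproc  18000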

We also did the Jetty-related Linux fine tuning described in the link below:

http://www.eclipse.org/jetty/documentation/current/high-load.html

Thanks.





Support LTR RankQuery with Grouping

2018-04-04 Thread ilayaraja
I am facing an issue: LTR queries are not supported with grouping.

I see the patch for this has been raised here
https://issues.apache.org/jira/browse/SOLR-8776

Is it available in solr/master (7.2.2) now?

Looks like this patch is not merged yet. 
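
For reference, the failing combination is a grouped request carrying an LTR
rerank query; a minimal sketch (model and field names are hypothetical):

    q=laptop&group=true&group.field=brand_s&rq={!ltr model=myModel reRankDocs=100}&fl=id,score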




-
--Ilay


Re: SOLR Cloud: 1500+ threads are in TIMED_WAITING status

2018-04-04 Thread Walter Underwood
Are you sending updates in batches? Are you doing a commit after every update?

You should use batches and you should not commit after every update.
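
A minimal sketch of a batched update (collection and field names here are
hypothetical); note there is no commit=true, so commits are left to your
autoCommit/autoSoftCommit settings:

    curl 'http://localhost:8983/solr/mycollection/update' \
      -H 'Content-Type: application/json' \
      -d '[{"id":"1","views_i":10},
           {"id":"2","views_i":12},
           {"id":"3","views_i":7}]'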

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Apr 4, 2018, at 8:01 PM, 苗海泉  wrote:
> 
> A lot of the time we also found many TIMED_WAITING threads, mainly commit
> threads and search threads, which led to a rapid decline in Solr's speed;
> the number of these threads went up to more than 2,000.
> I don't have a solution to this problem yet, but I found that reloading the
> collection reduces the number of commit threads.
> 
> 
> 
> 
> 2018-04-04 13:46 GMT+08:00 Doss :
> 
>> We have a SOLR (7.0.1) cloud of 3 Linux VM instances with 4 CPUs and 90 GB
>> RAM, with a ZooKeeper (3.4.11) ensemble running on the same machines. We
>> have 130 cores with an overall size of 45 GB. No sharding; almost all VMs
>> have the same copy of the data. These nodes are behind a load balancer.
>> 
>> Index Config (the XML element names were stripped by the mail archive;
>> the surviving values were):
>> =============
>>
>>   300
>>   30 / 100 / 30.0
>>   18 / 6
>>
>> Commit Configs (tags stripped by the archive; reconstructed assuming the
>> 10-minute hard commit and 1-minute soft commit stated elsewhere in the
>> thread):
>> ===============
>>
>>   <autoCommit>
>>     <maxTime>${solr.autoCommit.maxTime:600000}</maxTime>
>>     <openSearcher>false</openSearcher>
>>   </autoCommit>
>>
>>   <autoSoftCommit>
>>     <maxTime>${solr.autoSoftCommit.maxTime:60000}</maxTime>
>>   </autoSoftCommit>
>>
>> 
>> 
>> We do 3500 inserts/updates per second spread across all 130 cores; we have
>> yet to start using selects in earnest.
>> 
>> The problem we are facing is that at times the thread count suddenly
>> increases heavily, which leaves SOLR unresponsive or returning 503
>> responses to client (PHP HTTP cURL) requests.
>> 
>> Today, 04-04-2018, the thread dump shows that the peak went up to 13,000+.
>> 
>> Please help me fix this issue. Thanks!
>> 
>> 
>> Sample Threads:
>> ===
>> 
>> 1. updateExecutor-2-thread-25746-processing-http:
>> 172.10.2.19:8983//solr//profileviews x:profileviews r:core_node2
>> n:172.10.2.18:8983_solr s:shard1 c:profileviews", "state":"TIMED_WAITING",
>> "lock":"java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@297be1d5",
>> "cpuTime":"162.4371ms", "userTime":"120.ms",
>> "stackTrace":["sun.misc.Unsafe.park(Native Method)",
>> "java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)",
>> "java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)",
>>
>> 2. ERROR true HttpSolrCall
>> null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException:
>> Async exception during distributed update: Error from server at
>> 172.10.2.18:8983/solr/profileviews: Server Error request:
>> http://172.10.2.18:8983/solr/profileviews/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F172.10.2.19%3A8983%2Fsolr%2Fprofileviews%2F&wt=javabin&version=2
>> Remote error message: empty String
>>
>> 3. So many threads like:
>> "name":"qtp959447386-21",
>> "state":"TIMED_WAITING",
>> "lock":"java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6a1a2bf4",
>> "cpuTime":"4522.0837ms",
>> "userTime":"3770.ms",
>> "stackTrace":["sun.misc.Unsafe.park(Native Method)",
>> "java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)",
>> "java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)",
>> "org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)",
>> "org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:563)",
>> "org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:48)",
>> "org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)",
>> "java.lang.Thread.run(Thread.java:748)"]
>> 
> 
> 
> 
> -- 
> ==
> 联创科技 (LianChuang Technology)
> 知行如一 (Unity of knowledge and action)
> ==



Re: SOLR Cloud: 1500+ threads are in TIMED_WAITING status

2018-04-04 Thread 苗海泉
A lot of the time we also found many TIMED_WAITING threads, mainly commit
threads and search threads, which led to a rapid decline in Solr's speed;
the number of these threads went up to more than 2,000.
I don't have a solution to this problem yet, but I found that reloading the
collection reduces the number of commit threads.
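
For reference, a reload goes through the Collections API; a sketch with a
hypothetical collection name:

    curl 'http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection'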





2018-04-04 13:46 GMT+08:00 Doss :

> We have a SOLR (7.0.1) cloud of 3 Linux VM instances with 4 CPUs and 90 GB
> RAM, with a ZooKeeper (3.4.11) ensemble running on the same machines. We
> have 130 cores with an overall size of 45 GB. No sharding; almost all VMs
> have the same copy of the data. These nodes are behind a load balancer.
>
> Index Config (the XML element names were stripped by the mail archive;
> the surviving values were):
> =============
>
>   300
>   30 / 100 / 30.0
>   18 / 6
>
> Commit Configs (tags stripped by the archive; reconstructed assuming the
> 10-minute hard commit and 1-minute soft commit stated elsewhere in the
> thread):
> ===============
>
>   <autoCommit>
>     <maxTime>${solr.autoCommit.maxTime:600000}</maxTime>
>     <openSearcher>false</openSearcher>
>   </autoCommit>
>
>   <autoSoftCommit>
>     <maxTime>${solr.autoSoftCommit.maxTime:60000}</maxTime>
>   </autoSoftCommit>
>
>
>
> We do 3500 inserts/updates per second spread across all 130 cores; we have
> yet to start using selects in earnest.
>
> The problem we are facing is that at times the thread count suddenly
> increases heavily, which leaves SOLR unresponsive or returning 503
> responses to client (PHP HTTP cURL) requests.
>
> Today, 04-04-2018, the thread dump shows that the peak went up to 13,000+.
>
> Please help me fix this issue. Thanks!
>
>
> Sample Threads:
> ===
>
> 1. updateExecutor-2-thread-25746-processing-http:
> 172.10.2.19:8983//solr//profileviews x:profileviews r:core_node2
> n:172.10.2.18:8983_solr s:shard1 c:profileviews", "state":"TIMED_WAITING",
> "lock":"java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@297be1d5",
> "cpuTime":"162.4371ms", "userTime":"120.ms",
> "stackTrace":["sun.misc.Unsafe.park(Native Method)",
> "java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)",
> "java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)",
>
> 2. ERROR true HttpSolrCall
> null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException:
> Async exception during distributed update: Error from server at
> 172.10.2.18:8983/solr/profileviews: Server Error request:
> http://172.10.2.18:8983/solr/profileviews/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F172.10.2.19%3A8983%2Fsolr%2Fprofileviews%2F&wt=javabin&version=2
> Remote error message: empty String
>
> 3. So many threads like:
> "name":"qtp959447386-21",
> "state":"TIMED_WAITING",
> "lock":"java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6a1a2bf4",
> "cpuTime":"4522.0837ms",
> "userTime":"3770.ms",
> "stackTrace":["sun.misc.Unsafe.park(Native Method)",
> "java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)",
> "java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)",
> "org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)",
> "org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:563)",
> "org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:48)",
> "org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)",
> "java.lang.Thread.run(Thread.java:748)"]
>
>



-- 
==
联创科技 (LianChuang Technology)
知行如一 (Unity of knowledge and action)
==


Re: Need help to get started on Solr, searching get nothing. Thank you very much in advance

2018-04-04 Thread Raymond Xie
I have the data ready for indexing now; it is a JSON file:

{"122": "20180320-08:08:35.038", "49": "VIPER", "382": "0", "151": "1.0",
"9": "653", "10071": "20180320-08:08:35.088", "15": "JPY", "56": "XSVC",
"54": "1", "10202": "APMKTMAKING", "10537": "XOSE", "10217": "Y", "48":
"179492540", "201": "1", "40": "2", "8": "FIX.4.4", "167": "OPT", "421":
"JPN", "10292": "115", "10184": "3379122", "456": "101", "11210":
"3379122", "1133": "G", "10515": "178", "10": "200", "11032": "-1",
"10436": "20180320-08:08:35.038", "10518": "178", "11":
"3379122", "75":
"20180320", "10005": "178", "10104": "Y", "35": "RIO", "10208":
"APAC.VIPER.OOE", "59": "0", "60": "20180320-08:08:35.088", "528": "P",
"581": "13", "1": "TEST", "202": "25375.0", "455": "179492540", "55":
"JNI253D8.OS", "100": "XOSE", "52": "20180320-08:08:35.088", "10241":
"viperooe", "150": "A", "10039": "viperooe", "39": "A", "10438": "RIO.4.5",
"38": "1", "37": "3379122", "372": "D", "660": "102", "44": "2.0",
"10066": "20180320-08:08:35.038", "29": "4", "50": "JPNIK01", "22": "101"}

You can inspect the json here: https://jsonformatter.org/

I need to create an index and enable searching on tags 37, 75 and 10242
(where available; this sample message doesn't have 10242).

My understanding is that I need to create the managed-schema file, and I
added two field definitions to it (the XML was stripped by the mail archive).
Then I go back to the Solr Admin UI, but I don't see the two new fields in
the Schema section.

Is there anything I am missing here? And once the two fields are in the
managed-schema, can I add the JSON file through an upload in the Solr Admin
UI?
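
For reference, fields can also be added through the Schema API rather than
by editing managed-schema on disk; a sketch for the two tags (the field
types are assumptions, and the collection name should match yours):

    curl http://localhost:8983/solr/mycollection/schema -X POST \
      -H 'Content-type:application/json' --data-binary '{
        "add-field" : {"name":"37", "type":"string", "stored":true},
        "add-field" : {"name":"75", "type":"string", "stored":true}
    }'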

Thank you very much.






Sincerely yours,

Raymond

On Mon, Apr 2, 2018 at 9:04 AM, Rick Leir  wrote:

> Raymond
> There is a default field, normally configured via the df parameter. You
> would normally use copyField to copy all searchable fields into the
> default field.
> Cheers -- Rick
>
> On April 1, 2018 11:34:07 PM EDT, Raymond Xie 
> wrote:
> >Hi Rick,
> >
> >I sorted it out half:
> >
> >I should have specified the field in the search query, so, instead of
> >http://localhost:8983/solr/films/browse?q=batman, I should use:
> >http://localhost:8983/solr/films/browse?q=name:batman
> >
> >Sorry for this newbie mistake.
> >
> >But what if I (or the user) don't know, or don't want, to restrict the
> >search scope to the field "name", and instead want to match anywhere in
> >the indexed documents?
> >
> >
> >Sincerely yours,
> >Raymond
> >
> >On Sun, Apr 1, 2018 at 2:10 PM, Rick Leir  wrote:
> >
> >> Raymond
> >> The output is not visible to me because the mailing list strips
> >> images.
> >> Please try a different way to show the output.
> >> Cheers -- Rick
> >>
> >> On March 29, 2018 10:17:13 PM EDT, Raymond Xie 
> >> wrote:
> >> > I am new to Solr, following Steve Rowe's example on
> >>
> >>https://github.com/apache/lucene-solr/tree/master/solr/example/films:
> >> >
> >> >It would be greatly appreciated if anyone can enlighten me where to
> >> >start
> >> >troubleshooting, thank you very much in advance.
> >> >
> >> >The steps I followed are:
> >> >
> >> >Here ya go << END_OF_SCRIPT
> >> >
> >> >bin/solr stop
> >> >rm server/logs/*.log
> >> >rm -Rf server/solr/films/
> >> >bin/solr start
> >> >bin/solr create -c films
> >> >curl http://localhost:8983/solr/films/schema -X POST -H
> >> >'Content-type:application/json' --data-binary '{
> >> >"add-field" : {
> >> >"name":"name",
> >> >"type":"text_general",
> >> >"multiValued":false,
> >> >"stored":true
> >> >},
> >> >"add-field" : {
> >> >"name":"initial_release_date",
> >> >"type":"pdate",
> >> >"stored":true
> >> >}
> >> >}'
> >> >bin/post -c films example/films/films.json
> >> >curl http://localhost:8983/solr/films/config/params -H
> >> >'Content-type:application/json'  -d '{
> >> >"update" : {
> >> >  "facets": {
> >> >"facet.field":"genre"
> >> >}
> >> >  }
> >> >}'
> >> >
> >> ># END_OF_SCRIPT
> >> >
> >> >Additional fun -
> >> >
> >> >Add highlighting:
> >> >curl http://localhost:8983/solr/films/config/params -H
> >> >'Content-type:application/json'  -d '{
> >> >"set" : {
> >> >  "browse": {
> >> >"hl":"on",
> >> >"hl.fl":"name"
> >> >}
> >> >  }
> >> >}'
> >> >try http://localhost:8983/solr/films/browse?q=batman now, and you'll
> >> >see "batman" highlighted in the results
> >> >
> >> >
> >> >
> >> >I got nothing in my search: [screenshot stripped by the mailing list]
> >> >Sincerely yours,
> >> >Raymond
> >>
> >> --
> >> Sorry for being brief. Alternate email is rickleir at yahoo dot com
>
> --
> Sorry for being brief. Alternate email is rickleir at yahoo dot com


Re: Largest number of indexed documents used by Solr

2018-04-04 Thread 苗海泉
When we have 49 shards per collection and more than 600 collections, Solr
has serious performance problems, and I don't know how to deal with them.
My advice to you is to minimize the number of collections.
Our environment is 49 Solr server nodes, each with 32 CPUs / 128 GB RAM,
and the data volume is about 50 billion documents per day.




2018-04-04 9:23 GMT+08:00 Yago Riveiro :

> Hi,
>
> In my company we are running a 12-node cluster with 10 (US) billion
> documents, 12 shards / 2 replicas.
>
> We do mainly faceting queries, with very reasonable performance.
>
> 36 million documents is not an issue; you can handle that volume of
> documents with 2 nodes with SSDs and 32 GB of RAM.
>
> Regards.
>
> --
>
> Yago Riveiro
>
> On 4 Apr 2018 02:15 +0100, Abhi Basu <9000r...@gmail.com>, wrote:
> > We have tested Solr 4.10 with 200 million docs with avg doc size of 250
> KB.
> > No issues with performance when using 3 shards / 2 replicas.
> >
> >
> >
> > On Tue, Apr 3, 2018 at 8:12 PM, Steven White 
> wrote:
> >
> > > Hi everyone,
> > >
> > > I'm about to start a project that requires indexing 36 million records
> > > using Solr 7.2.1. Each record ranges from 500 KB to 0.25 MB, with an
> > > average of 0.1 MB.
> > >
> > > Has anyone indexed this number of records? What are the things I should
> > > worry about? And out of curiosity, what is the largest number of
> records
> > > that Solr has indexed which is published out there?
> > >
> > > Thanks
> > >
> > > Steven
> > >
> >
> >
> >
> > --
> > Abhi Basu
>



-- 
==
联创科技 (LianChuang Technology)
知行如一 (Unity of knowledge and action)
==


[ANNOUNCE] Apache Solr 7.3.0 released

2018-04-04 Thread Alan Woodward
4th April 2018, Apache Solr™ 7.3.0 available

The Lucene PMC is pleased to announce the release of Apache Solr 7.3.0.

Solr is the popular, blazing fast, open source NoSQL search platform from
the Apache Lucene project. Its major features include powerful full-text
search, hit highlighting, faceted search and analytics, rich document
parsing, geospatial search, extensive REST APIs as well as parallel SQL.
Solr is enterprise grade, secure and highly scalable, providing fault
tolerant distributed search and indexing, and powers the search and
navigation features of many of the world's largest internet sites.

This release includes the following changes since the 7.2.0 release:

- A new update request processor supports OpenNLP-based entity extraction
and language detection
- Support for automatic time-based collection creation
- Multivalued primitive fields can now be used in sorting
- A new SortableTextField allows both indexing and sorting/faceting on free
text
- Several new stream evaluators
- Improvements around leader-initiated recovery
- New autoscaling features: triggers can perform operations based on any
metric available from the Metrics API, based on a defined schedule, or in
response to a query rate over a 1-minute average. A new screen in the Admin
UI will show suggested autoscaling actions.
- Metrics can now be exported to Prometheus
- {!parent} and {!child} support filtering with exclusions via new local
parameters
- Introducing {!filters} query parser for referencing filter queries and
excluding them
- Support for running Solr with Java 10
- A new contrib/ltr NeuralNetworkModel class

Furthermore, this release includes Apache Lucene 7.3.0, which itself
includes several changes since the 7.2.0 release.

The release is available for immediate download at:

http://www.apache.org/dyn/closer.lua/lucene/solr/7.3.0

Please read CHANGES.txt for a detailed list of changes:

https://lucene.apache.org/solr/7_3_0/changes/Changes.html

Please report any feedback to the mailing lists
(http://lucene.apache.org/solr/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases. It is possible that the mirror you are using may
not have replicated the release yet. If that is the case, please try
another mirror. This also goes for Maven access.


Re: ZK CLI script giving IOException doing upconfig

2018-04-04 Thread Shawn Heisey
On 4/4/2018 12:13 PM, Doug Turnbull wrote:
> Thanks for the responses. Yeah I thought they were weird errors too... :)
>
> Below are the logs from zookeeper running in foreground after a connection
> attempt. But this Exception looks suspicous to me:
>
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@383] - Exception
> causing close of session 0x10024db7e280006: *Len error 5327937*

With that information, I think I can tell you what went wrong.

It looks like one of the files you're trying to upload is 5 megabytes in
size.  ZooKeeper doesn't allow anything bigger than about 1 megabyte by
default, because it is not designed for handling large amounts of data.

I think that the ZK uploading functionality probably needs to check the
size of what it is uploading against the max buffer setting and log a
useful error message.

You can get this to work, but to do so will require setting a system
property on *all* ZK clients and servers.  The clients will include Solr
itself and the zkcli script.  The system property to set is
"jute.maxbuffer".  Info can be found in ZK documentation.

https://zookeeper.apache.org/doc/r3.4.11/zookeeperAdmin.html
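
As a sketch, the property has to reach every JVM involved; the value below
(10 MB) is only an example:

    # Solr servers, e.g. in solr.in.sh:
    SOLR_OPTS="$SOLR_OPTS -Djute.maxbuffer=10485760"

    # ZooKeeper server and client (zkServer.sh / zkCli.sh read JVMFLAGS):
    export JVMFLAGS="-Djute.maxbuffer=10485760"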

Thanks,
Shawn



Re: ZK CLI script giving IOException doing upconfig

2018-04-04 Thread Susheel Kumar
Hi Doug, are you able to connect to ZooKeeper through ZooKeeper's own
zkCli.sh, or does zookeeper.out show anything useful?
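
For example (path and port assumed):

    # from the ZooKeeper installation
    bin/zkCli.sh -server localhost:2181
    ls /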

Thnx

On Wed, Apr 4, 2018 at 2:13 PM, Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:

> Thanks for the responses. Yeah I thought they were weird errors too... :)
>
> Below are the logs from zookeeper running in foreground after a connection
> attempt. But this Exception looks suspicious to me:
>
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@383] - Exception
> causing close of session 0x10024db7e280006: *Len error 5327937*
>
> Has anyone seen this before? The Len error seems to be a good thread to
> google...
>
> 2018-04-04 14:06:01,210 [myid:] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - Accepted socket
> connection
> from /127.0.0.1:55078
> 2018-04-04 14:06:01,218 [myid:] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@938] - Client attempting to establish
> new session at /127.0.0.1:55078
> 2018-04-04 14:06:01,219 [myid:] - INFO  [SyncThread:0:ZooKeeperServer@683]
> - Established session 0x10024db7e280006 with negotiated timeout 3 for
> client /127.0.0.1:55078
> 2018-04-04 14:06:01,361 [myid:] - WARN  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@383] - Exception causing close of
> session 0x10024db7e280006: Len error 5327937
> 2018-04-04 14:06:01,362 [myid:] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for
> client /127.0.0.1:55078 which had sessionid 0x10024db7e280006
> 2018-04-04 14:06:01,956 [myid:] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - Accepted socket
> connection
> from /0:0:0:0:0:0:0:1:55079
> 2018-04-04 14:06:01,959 [myid:] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@931] - Client attempting to renew
> session 0x10024db7e280006 at /0:0:0:0:0:0:0:1:55079
> 2018-04-04 14:06:01,960 [myid:] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@683] - Established session
> 0x10024db7e280006 with negotiated timeout 3 for client
> /0:0:0:0:0:0:0:1:55079
> 2018-04-04 14:06:03,223 [myid:] - WARN  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data
> from client sessionid 0x10024db7e280006, likely client has closed socket
> 2018-04-04 14:06:03,223 [myid:] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for
> client /0:0:0:0:0:0:0:1:55079 which had sessionid 0x10024db7e280006
>
> On Wed, Apr 4, 2018 at 11:15 AM Shawn Heisey  wrote:
>
> > On 4/4/2018 7:14 AM, Doug Turnbull wrote:
> > > I've been struggling to do a basic upconfig both with embedded and
> actual
> > > Zookeeper in Solr 7.2.1 using the zkcli script on OSX.
> > >
> > > One variable, I recently upgraded to Java 9. I get slightly different
> > > errors on Java 8 vs 9
> >
> > 
> >
> > > Java 9:
> > >
> > > doug@wiz$~/ws/foo(mas) $
> > > /Users/doug/bin/solr-7.2.1/server/scripts/cloud-scripts/zkcli.sh
> -zkhost
> > > localhost:2181 -cmd upconfig -confdir solr_home/foo/ -confname foo_conf
> > > WARN  - 2018-04-04 09:05:28.194;
> > > org.apache.zookeeper.ClientCnxn$SendThread; Session 0x100244e8ffb0004
> for
> > > server localhost/127.0.0.1:2181, unexpected error, closing socket
> > > connection and attempting reconnect
> > > java.io.IOException: Connection reset by peer
> >
> > 
> >
> > > Java 8 gives the error
> > >
> > > java.io.IOException: Protocol wrong type for socket
> > >
> > > WARN  - 2018-04-04 09:10:11.879;
> > > org.apache.zookeeper.ClientCnxn$SendThread; Session 0x10024db7e280002
> for
> > > server localhost/0:0:0:0:0:0:0:1:2181, unexpected error, closing
> socket
> > > connection and attempting reconnect
> > > java.io.IOException: Protocol wrong type for socket
> >
> > I'm with Erick on this.  These are REALLY weird errors. The stacktraces
> > for the errors are entirely in ZooKeeper and Java code, not Solr code.
> > The log for Java 9 does have an entry that mentions Solr classes, but
> > that's a disconnect after the error, not part of the error.
> >
> > Are you getting any corresponding log messages in the ZK server log?
> >
> > The ZkCLI class is part of Solr, and does interface to ZK through Solr
> > internals, but ultimately it's ZK doing the work.
> >
> > The ZK client that's in Solr 7.2.1 is version 3.4.10.
> >
> > Thanks,
> > Shawn
> >
> > --
> CTO, OpenSource Connections
> Author, Relevant Search
> http://o19s.com/doug
>


Re: ZK CLI script giving IOException doing upconfig

2018-04-04 Thread Doug Turnbull
Thanks for the responses. Yeah I thought they were weird errors too... :)

Below are the logs from zookeeper running in foreground after a connection
attempt. But this Exception looks suspicious to me:

[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@383] - Exception
causing close of session 0x10024db7e280006: *Len error 5327937*

Has anyone seen this before? The Len error seems to be a good thread to
google...

2018-04-04 14:06:01,210 [myid:] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - Accepted socket connection
from /127.0.0.1:55078
2018-04-04 14:06:01,218 [myid:] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:ZooKeeperServer@938] - Client attempting to establish
new session at /127.0.0.1:55078
2018-04-04 14:06:01,219 [myid:] - INFO  [SyncThread:0:ZooKeeperServer@683]
- Established session 0x10024db7e280006 with negotiated timeout 3 for
client /127.0.0.1:55078
2018-04-04 14:06:01,361 [myid:] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@383] - Exception causing close of
session 0x10024db7e280006: Len error 5327937
2018-04-04 14:06:01,362 [myid:] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for
client /127.0.0.1:55078 which had sessionid 0x10024db7e280006
2018-04-04 14:06:01,956 [myid:] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - Accepted socket connection
from /0:0:0:0:0:0:0:1:55079
2018-04-04 14:06:01,959 [myid:] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:ZooKeeperServer@931] - Client attempting to renew
session 0x10024db7e280006 at /0:0:0:0:0:0:0:1:55079
2018-04-04 14:06:01,960 [myid:] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:ZooKeeperServer@683] - Established session
0x10024db7e280006 with negotiated timeout 3 for client
/0:0:0:0:0:0:0:1:55079
2018-04-04 14:06:03,223 [myid:] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data
from client sessionid 0x10024db7e280006, likely client has closed socket
2018-04-04 14:06:03,223 [myid:] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for
client /0:0:0:0:0:0:0:1:55079 which had sessionid 0x10024db7e280006

On Wed, Apr 4, 2018 at 11:15 AM Shawn Heisey  wrote:

> On 4/4/2018 7:14 AM, Doug Turnbull wrote:
> > I've been struggling to do a basic upconfig both with embedded and actual
> > Zookeeper in Solr 7.2.1 using the zkcli script on OSX.
> >
> > One variable, I recently upgraded to Java 9. I get slightly different
> > errors on Java 8 vs 9
>
> 
>
> > Java 9:
> >
> > doug@wiz$~/ws/foo(mas) $
> > /Users/doug/bin/solr-7.2.1/server/scripts/cloud-scripts/zkcli.sh -zkhost
> > localhost:2181 -cmd upconfig -confdir solr_home/foo/ -confname foo_conf
> > WARN  - 2018-04-04 09:05:28.194;
> > org.apache.zookeeper.ClientCnxn$SendThread; Session 0x100244e8ffb0004 for
> > server localhost/127.0.0.1:2181, unexpected error, closing socket
> > connection and attempting reconnect
> > java.io.IOException: Connection reset by peer
>
> 
>
> > Java 8 gives the error
> >
> > java.io.IOException: Protocol wrong type for socket
> >
> > WARN  - 2018-04-04 09:10:11.879;
> > org.apache.zookeeper.ClientCnxn$SendThread; Session 0x10024db7e280002 for
> > server localhost/0:0:0:0:0:0:0:1:2181, unexpected error, closing socket
> > connection and attempting reconnect
> > java.io.IOException: Protocol wrong type for socket
>
> I'm with Erick on this.  These are REALLY weird errors. The stacktraces
> for the errors are entirely in ZooKeeper and Java code, not Solr code.
> The log for Java 9 does have an entry that mentions Solr classes, but
> that's a disconnect after the error, not part of the error.
>
> Are you getting any corresponding log messages in the ZK server log?
>
> The ZkCLI class is part of Solr, and does interface to ZK through Solr
> internals, but ultimately it's ZK doing the work.
>
> The ZK client that's in Solr 7.2.1 is version 3.4.10.
>
> Thanks,
> Shawn
>
> --
CTO, OpenSource Connections
Author, Relevant Search
http://o19s.com/doug


Re: some parent documents

2018-04-04 Thread Mikhail Khludnev
>
> What's happening under the hood of
> solr in answering query [1] from [2]?

https://github.com/apache/lucene-solr/blob/master/lucene/join/src/java/org/apache/lucene/search/join/ToParentBlockJoinQuery.java#L178
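
In short: child matches are walked once, and each one is mapped to its
enclosing parent through a precomputed bitset of parent documents (children
are indexed contiguously, immediately before their parent). A simplified
sketch of that loop in Java, not the actual Lucene source:

    // parentBits: BitSet marking parent docs within the segment
    int childDoc = childScorer.nextDoc();
    while (childDoc != NO_MORE_DOCS) {
      // the first parent at or after a child doc closes its block
      int parentDoc = parentBits.nextSetBit(childDoc);
      collect(parentDoc);                            // score/aggregate the block's children
      childDoc = childScorer.advance(parentDoc + 1); // jump to the next block
    }

So the cost is roughly one linear pass over the child matches plus bitset
lookups: near-linear, not NP-hard.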

On Wed, Apr 4, 2018 at 3:39 PM, Arturas Mazeika  wrote:

> Hi Mikhail et al,
>
> Thanks a lot for a very thorough answer. This is an impressive piece of
> knowledge you just shared.
>
> Not surprisingly, I was caught unprepared by the 'v=...' part of the
> answer. This brought me to the links you posted (starts with http). From
> those links I went to the more updated link (starts with https), which
> brought me to other very resourceful links. Combined with some meditation
> session, it came into my mind that it is not possible to express block
> queries using mathematical logic only. The format of the input document is
> deeply built into the query expression and answering. Expressing these
> queries mathematically / logically may give an impression that solr is
> capable of answering (NP-?) hard problems. I have a feeling though that
> solr answers to queries in polynomial (or even almost linear) times.
>
> Just to connect the remaining dots.. What's happening under the hood of
> solr in answering query [1] from [2]? Is it really so that the inverted
> index is used to identify the vectors of ids, which are then scanned
> linearly in the hope of getting matches on _root_ and other internal fields?
>
> [1] q=+{!parent which=type_s:product v=$skuq} +{!parent
> which=type_s:product v=$vendorq}&skuq=+COLOR_s:Blue +SIZE_s:XL +{!parent
> which=type_s:sku v='+QTY_i:[10 TO *] +STATE_s:CA'}&vendorq=+NAME_s:Bob
> +PRICE_i:[20 TO 25]
> [2]
> https://blog.griddynamics.com/searching-grandchildren-and-siblings-with-solr-block-join/
>
> Thanks!
> Arturas
>
> On Wed, Apr 4, 2018 at 12:36 PM, Mikhail Khludnev  wrote:
>
> > q=+{!parent which=ntype:p v='+msg:Hello +person:Arturas'} +{!parent
> > which=ntype:p v='+msg:ciao +person:Vai'}
> >
> > On Wed, Apr 4, 2018 at 12:19 PM, Arturas Mazeika 
> > wrote:
> >
> > > Hi Mikhail et al,
> > >
> > > It seems to me that the nested documents must include nodes that encode
> > the
> > > level of nodes (within the document). Therefore, the minimal example
> must
> > > include the node type. Is the following structure sufficient?
> > >
> > > {
> > > "id":1,
> > > "ntype":"p",
> > > "_childDocuments_":
> > > [
> > > {"id":"1_1", "ntype":"c", "person":"Vai", "time":"3:14",
> > > "msg":"Hello"},
> > > {"id":"1_2", "ntype":"c", "person":"Arturas", "time":"3:14",
> > > "msg":"Hello"},
> > > {"id":"1_3", "ntype":"c", "person":"Vai", "time":"3:15",
> > > "msg":"Coz Mathias is working on another system- different screen."},
> > > {"id":"1_4", "ntype":"c", "person":"Vai", "time":"3:15",
> > > "msg":"It can get annoying"},
> > > {"id":"1_5", "ntype":"c", "person":"Arturas", "time":"3:15",
> > > "msg":"Thank you. this is very nice of you"},
> > > {"id":"1_6", "ntype":"c", "person":"Vai", "time":"3:16",
> > > "msg":"ciao"},
> > > {"id":"1_7", "ntype":"c", "person":"Arturas", "time":"3:16",
> > > "msg":"ciao"}
> > > ]
> > > },
> > > {
> > > "id":2,
> > > "ntype":"p",
> > > "_childDocuments_":
> > > [
> > > {"id":"2_1", "ntype":"c", "person":"Vai", "time":"4:14",
> > > "msg":"Hi"},
> > > {"id":"2_2", "ntype":"c", "person":"Arturas", "time":"4:14",
> > > "msg":"IBM Watson"},
> > > {"id":"2_3", "ntype":"c", "person":"Vai", "time":"4:15",
> > > "msg":"need to retain content"},
> > > {"id":"2_4", "ntype":"c", "person":"Vai", "time":"4:15",
> > > "msg":"It can get annoying"},
> > > {"id":"2_5", "ntype":"c", "person":"Arturas", "time":"4:15",
> > > "msg":"You can make all your meetings more access"},
> > > {"id":"2_6", "ntype":"c", "person":"Vai", "time":"4:16",
> > > "msg":"Make every meeting a Skype meeting"},
> > > {"id":"2_7", "ntype":"c", "person":"Arturas", "time":"4:16",
> > > "msg":"ciao"}
> > > ]
> > > }
> > >
> > > How would a query look like that has a Hello from Person Arturas and
> ciao
> > > from Person Vai?
> > >
> > > Cheers,
> > > Arturas
> > >
> > >
> > > On Tue, Apr 3, 2018 at 5:21 PM, Arturas Mazeika 
> > wrote:
> > >
> > > > Hi Mikhail,
> > > >
> > > > Thanks a lot for the reply.
> > > >
> > > > You mentioned that
> > > >
> > > > q=+{!parent which.. v='+text:hello +person:A'} +{!parent
> > > > which..v='+text:ciao +person:B'}
> > > >
> > > > is the way to go. How would it look like precisely for the following
> > > > collection?
> > > >
> > > > {
> > > > "id":1,
> > > > "_childDocuments_":
> > > > [
> > > > {"id":"1_1", "person":"Vai" , "time":"3:14",
> > > > "msg":"Hello"},
> > > > {"id":"1_2", "person":"Arturas" , "time":"3:14",
> > > > "msg":"Hello"},
> > > > {"id":"1_3", "person":"Vai"  

Re: Trying to Restore older indexes in Solr7.2.1

2018-04-04 Thread Shawn Heisey

On 4/4/2018 4:31 AM, Mugdha Varadkar wrote:
> The hash ranges are available in each collection's state.json file. Here is
> the data for the collections:

I thought you said these indexes were sharded?  That only shows one
shard.  It has the full hash range.  If they're all like that, then
don't worry about hash ranges matching up.

>> Most schema changes require rebuilding the index from scratch.
>
> So the solution so far is to export the documents of the older collection
> after upgrading to Solr 7.2.1 and import them into the new collection.

If you're using an export/import process, you probably want to perform
the export on the old version, not the new version.

>> Are you getting any errors when you query the index?  Look in solr.log.
>
> Not yet.

Exactly what are you doing to export your data?  What does the export
look like?  How are you importing it?  Did you commit the changes after
importing?


Thanks,
Shawn



Re: ZK CLI script giving IOException doing upconfig

2018-04-04 Thread Shawn Heisey

On 4/4/2018 7:14 AM, Doug Turnbull wrote:
> I've been struggling to do a basic upconfig both with embedded and actual
> Zookeeper in Solr 7.2.1 using the zkcli script on OSX.
>
> One variable, I recently upgraded to Java 9. I get slightly different
> errors on Java 8 vs 9

> Java 9:
>
> doug@wiz$~/ws/foo(mas) $
> /Users/doug/bin/solr-7.2.1/server/scripts/cloud-scripts/zkcli.sh -zkhost
> localhost:2181 -cmd upconfig -confdir solr_home/foo/ -confname foo_conf
> WARN  - 2018-04-04 09:05:28.194;
> org.apache.zookeeper.ClientCnxn$SendThread; Session 0x100244e8ffb0004 for
> server localhost/127.0.0.1:2181, unexpected error, closing socket
> connection and attempting reconnect
> java.io.IOException: Connection reset by peer

> Java 8 gives the error
>
> java.io.IOException: Protocol wrong type for socket
>
> WARN  - 2018-04-04 09:10:11.879;
> org.apache.zookeeper.ClientCnxn$SendThread; Session 0x10024db7e280002 for
> server localhost/0:0:0:0:0:0:0:1:2181, unexpected error, closing socket
> connection and attempting reconnect
> java.io.IOException: Protocol wrong type for socket


I'm with Erick on this.  These are REALLY weird errors. The stacktraces 
for the errors are entirely in ZooKeeper and Java code, not Solr code.  
The log for Java 9 does have an entry that mentions Solr classes, but 
that's a disconnect after the error, not part of the error.


Are you getting any corresponding log messages in the ZK server log?

The ZkCLI class is part of Solr, and does interface to ZK through Solr 
internals, but ultimately it's ZK doing the work.


The ZK client that's in Solr 7.2.1 is version 3.4.10.

Thanks,
Shawn



Re: ZK CLI script giving IOException doing upconfig

2018-04-04 Thread Erick Erickson
I haven't seen those errors before, so it's puzzling. Is there
any chance there are conflicting _zookeeper_ jars somewhere
in your classpath? This looks like a problem with ZK talking
to itself.

You may find it easier just to use the bin/solr script; we tried
to put the useful ZK commands there. Whether that would
get around your error I don't know. The command you
used _should_ work.

"bin/solr zk -help"

Best,
Erick

On Wed, Apr 4, 2018 at 6:14 AM, Doug Turnbull
 wrote:
> I've been struggling to do a basic upconfig both with embedded and actual
> Zookeeper in Solr 7.2.1 using the zkcli script on OSX.
>
> One variable, I recently upgraded to Java 9. I get slightly different
> errors on Java 8 vs 9
>
> This is probably me being dumb, but googling / searching Jira hasn't really
> yielded anything fruitful. Perhaps my google fu is weak this morning.
>
> Thanks for any help
> -Doug
>
>
>
> Java 9:
>
> doug@wiz$~/ws/foo(mas) $
> /Users/doug/bin/solr-7.2.1/server/scripts/cloud-scripts/zkcli.sh -zkhost
> localhost:2181 -cmd upconfig -confdir solr_home/foo/ -confname foo_conf
> WARN  - 2018-04-04 09:05:28.194;
> org.apache.zookeeper.ClientCnxn$SendThread; Session 0x100244e8ffb0004 for
> server localhost/127.0.0.1:2181, unexpected error, closing socket
> connection and attempting reconnect
> java.io.IOException: Connection reset by peer
> at java.base/sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> at java.base/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:192)
> at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:382)
> at
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
> at
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
> WARN  - 2018-04-04 09:05:28.310;
> org.apache.solr.common.cloud.ConnectionManager; Watcher
> org.apache.solr.common.cloud.ConnectionManager@3eed2369 name:
> ZooKeeperConnection Watcher:localhost:2181 got event WatchedEvent
> state:Disconnected type:None path:null path: null type: None
>
> Java 8 gives the error
>
> java.io.IOException: Protocol wrong type for socket
>
> WARN  - 2018-04-04 09:10:11.879;
> org.apache.zookeeper.ClientCnxn$SendThread; Session 0x10024db7e280002 for
> server localhost/0:0:0:0:0:0:0:1:2181, unexpected error, closing socket
> connection and attempting reconnect
> java.io.IOException: Protocol wrong type for socket
> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
> at sun.nio.ch.IOUtil.write(IOUtil.java:65)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
> at
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
> at
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
>
> --
> CTO, OpenSource Connections
> Author, Relevant Search
> http://o19s.com/doug


Re: SolrCloud 7.2 problem with leader election

2018-04-04 Thread Gael Jourdan-Weil
With the property legacyCloud=true, coreNodeNames are written correctly by
Solr in the core.properties files.


We are wondering whether the problem comes from our configuration or from
the bugfix https://issues.apache.org/jira/browse/SOLR-11503.
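
For reference, legacyCloud is a cluster property set through the
Collections API; a sketch (host and port assumed):

    curl 'http://localhost:8983/solr/admin/collections?action=CLUSTERPROP&name=legacyCloud&val=true'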



Without legacyCloud=true:

> Our configuration before Solr start:

.../cores/fr_green/core.properties on both hosts:

    shard=shard1
    dataDir=/opt/kookel/data/searchSolrNode/solrindex/fr_green

> Our configuration after Solr start:

.../cores/fr_green/core.properties host1:

    #Written by CorePropertiesLocator
    #Wed Apr 04 13:41:10 UTC 2018
    name=fr_green
    shard=shard1
    dataDir=/opt/kookel/data/searchSolrNode/solrindex/fr_green
    coreNodeName=core_node1

.../cores/fr_green/core.properties host2:

    #Written by CorePropertiesLocator
    #Wed Apr 04 13:41:10 UTC 2018
    name=fr_green
    shard=shard1
    dataDir=/opt/kookel/data/searchSolrNode/solrindex/fr_green
    coreNodeName=core_node1

--

With legacyCloud=true:

> Our configuration before Solr start:

.../cores/fr_green/core.properties on both hosts:

    shard=shard1
    dataDir=/opt/kookel/data/searchSolrNode/solrindex/fr_green

> Our configuration after Solr start:

.../cores/fr_green/core.properties host1:

    #Written by CorePropertiesLocator
    #Wed Apr 04 13:41:10 UTC 2018
    name=fr_green
    shard=shard1
    dataDir=/opt/kookel/data/searchSolrNode/solrindex/fr_green
    coreNodeName=core_node1

.../cores/fr_green/core.properties host2:

    #Written by CorePropertiesLocator
    #Wed Apr 04 13:41:10 UTC 2018
    name=fr_green
    shard=shard1
    dataDir=/opt/kookel/data/searchSolrNode/solrindex/fr_green
    coreNodeName=core_node2

=> coreNodeName for host2 is correct





On 03/04/18 15:46, Gael Jourdan-Weil wrote:

Hello,

We are trying to upgrade from Solr 6.6 to Solr 7.2.1 and we are using Solr 
Cloud.

Doing some tests with 2 replicas, ZooKeeper doesn't know which one to elect as 
a leader:

ERROR org.apache.solr.cloud.ZkController:getLeader:1206  - Error getting leader 
from zk
org.apache.solr.common.SolrException: There is conflicting information about 
the leader of shard: shard1 our state 
says:http://host1:8080/searchsolrnodefr/fr_blue/ but zookeeper 
says:http://host2:8080/searchsolrnodefr/fr_blue/

solr.xml file:
<bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>

In the core.properties files, each replica has the same coreNodeName value: 
"core_node1".
When changing this property on host2 with value "core_node2", ZooKeeper can 
elect a leader and everything is fine.

Do we need to set genericCoreNodeNames to false in solr.xml ?

Gaël




Re: Solr cloud schema and schemaless

2018-04-04 Thread Kojo
Many thanks Erick.
I think we found the issue regarding schemaless: the source file has to
follow a specific format, and we were trying to index XML that did not
follow the Solr update format.
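
For reference, the XML update format Solr expects looks roughly like this
(field names are hypothetical and schema-dependent):

    <add>
      <doc>
        <field name="id">1</field>
        <field name="title">An example document</field>
      </doc>
    </add>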

Also, thanks for the advice about field types. These schemaless collections
will all come from the same source, hence normalized. I hope...

Koji



2018-04-04 0:57 GMT-03:00 Erick Erickson :

> The schema mode is _per collection_, not per node. So there's no trouble
> mixing
> replicas from collection A running schema model 1 with replicas from
> collection B
> running a different schema model.
>
> That said, schemaless is _not_ recommended for production unless you have
> total control over the ETL chain and can guarantee that documents conform
> to
> some standard. Schemaless does its best, but it guesses based on the
> first time it
> sees a field. So if the first doc has field X with a value of 1, it
> infers that this field
> is an int type. If doc2 has a value of 1.0, the doc fails with a parsing
> error.
>
> FYI,
> Erick
>
> On Tue, Apr 3, 2018 at 2:39 PM, Kojo  wrote:
> > Hi Solrs,
> > We have a Solr cloud running on three nodes.
> > Five collections are running in schema mode and we would like to create
> > another collection running schemaless.
> >
> > Does schema and schemaless fit together on the same nodes?
> >
> > I am not sure, because on this page Solr is started in schemaless mode,
> > but I start Solr cloud without this option.
> >
> > https://lucene.apache.org/solr/guide/6_6/schemaless-mode.html
> >
> > bin/solr start -e schemaless
> >
> >
> >
> > Thank you all!
>


ZK CLI script giving IOException doing upconfig

2018-04-04 Thread Doug Turnbull
I've been struggling to do a basic upconfig both with embedded and actual
Zookeeper in Solr 7.2.1 using the zkcli script on OSX.

One variable, I recently upgraded to Java 9. I get slightly different
errors on Java 8 vs 9

This is probably me being dumb, but googling / searching Jira hasn't really
yielded anything fruitful. Perhaps my google fu is weak this morning.

Thanks for any help
-Doug



Java 9:

doug@wiz$~/ws/foo(mas) $
/Users/doug/bin/solr-7.2.1/server/scripts/cloud-scripts/zkcli.sh -zkhost
localhost:2181 -cmd upconfig -confdir solr_home/foo/ -confname foo_conf
WARN  - 2018-04-04 09:05:28.194;
org.apache.zookeeper.ClientCnxn$SendThread; Session 0x100244e8ffb0004 for
server localhost/127.0.0.1:2181, unexpected error, closing socket
connection and attempting reconnect
java.io.IOException: Connection reset by peer
at java.base/sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at java.base/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:192)
at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:382)
at
org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
at
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
WARN  - 2018-04-04 09:05:28.310;
org.apache.solr.common.cloud.ConnectionManager; Watcher
org.apache.solr.common.cloud.ConnectionManager@3eed2369 name:
ZooKeeperConnection Watcher:localhost:2181 got event WatchedEvent
state:Disconnected type:None path:null path: null type: None

Java 8 gives the error

java.io.IOException: Protocol wrong type for socket

WARN  - 2018-04-04 09:10:11.879;
org.apache.zookeeper.ClientCnxn$SendThread; Session 0x10024db7e280002 for
server localhost/0:0:0:0:0:0:0:1:2181, unexpected error, closing socket
connection and attempting reconnect
java.io.IOException: Protocol wrong type for socket
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at
org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
at
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)

-- 
CTO, OpenSource Connections
Author, Relevant Search
http://o19s.com/doug


Re: some parent documents

2018-04-04 Thread Arturas Mazeika
Hi Mikhail et al,

Thanks a lot for a very thorough answer. This is an impressive piece of
knowledge you just shared.

Not surprisingly, I was caught unprepared by the 'v=...' part of the
answer. This brought me to the links you posted (starts with http). From
those links I went to the more updated link (starts with https), which
brought me to other very resourceful links. Combined with some meditation
session, it came into my mind that it is not possible to express block
queries using mathematical logic only. The format of the input document is
deeply built into the query expression and answering. Expressing these
queries mathematically / logically may give an impression that solr is
capable of answering (NP-?) hard problems. I have a feeling though that
solr answers to queries in polynomial (or even almost linear) times.

Just to connect the remaining dots.. What's happening under the hood of
solr in answering query [1] from [2]? Is it really so that the inverted
index is used to identify the vectors of ids, which are then scanned
linearly in the hope of getting matches on _root_ and other internal fields?

[1] q=+{!parent which=type_s:product v=$skuq} +{!parent
which=type_s:product v=$vendorq}&skuq=+COLOR_s:Blue +SIZE_s:XL +{!parent
which=type_s:sku v='+QTY_i:[10 TO *] +STATE_s:CA'}&vendorq=+NAME_s:Bob
+PRICE_i:[20 TO 25]
[2]
https://blog.griddynamics.com/searching-grandchildren-and-siblings-with-solr-block-join/

Thanks!
Arturas

On Wed, Apr 4, 2018 at 12:36 PM, Mikhail Khludnev  wrote:

> q=+{!parent which=ntype:p v='+msg:Hello +person:Arturas'} +{!parent
> which=ntype:p v='+msg:ciao +person:Vai'}
>
> On Wed, Apr 4, 2018 at 12:19 PM, Arturas Mazeika 
> wrote:
>
> > Hi Mikhail et al,
> >
> > It seems to me that the nested documents must include nodes that encode
> the
> > level of nodes (within the document). Therefore, the minimal example must
> > include the node type. Is the following structure sufficient?
> >
> > {
> > "id":1,
> > "ntype":"p",
> > "_childDocuments_":
> > [
> > {"id":"1_1", "ntype":"c", "person":"Vai", "time":"3:14",
> > "msg":"Hello"},
> > {"id":"1_2", "ntype":"c", "person":"Arturas", "time":"3:14",
> > "msg":"Hello"},
> > {"id":"1_3", "ntype":"c", "person":"Vai", "time":"3:15",
> > "msg":"Coz Mathias is working on another system- different screen."},
> > {"id":"1_4", "ntype":"c", "person":"Vai", "time":"3:15",
> > "msg":"It can get annoying"},
> > {"id":"1_5", "ntype":"c", "person":"Arturas", "time":"3:15",
> > "msg":"Thank you. this is very nice of you"},
> > {"id":"1_6", "ntype":"c", "person":"Vai", "time":"3:16",
> > "msg":"ciao"},
> > {"id":"1_7", "ntype":"c", "person":"Arturas", "time":"3:16",
> > "msg":"ciao"}
> > ]
> > },
> > {
> > "id":2,
> > "ntype":"p",
> > "_childDocuments_":
> > [
> > {"id":"2_1", "ntype":"c", "person":"Vai", "time":"4:14",
> > "msg":"Hi"},
> > {"id":"2_2", "ntype":"c", "person":"Arturas", "time":"4:14",
> > "msg":"IBM Watson"},
> > {"id":"2_3", "ntype":"c", "person":"Vai", "time":"4:15",
> > "msg":"need to retain content"},
> > {"id":"2_4", "ntype":"c", "person":"Vai", "time":"4:15",
> > "msg":"It can get annoying"},
> > {"id":"2_5", "ntype":"c", "person":"Arturas", "time":"4:15",
> > "msg":"You can make all your meetings more access"},
> > {"id":"2_6", "ntype":"c", "person":"Vai", "time":"4:16",
> > "msg":"Make every meeting a Skype meeting"},
> > {"id":"2_7", "ntype":"c", "person":"Arturas", "time":"4:16",
> > "msg":"ciao"}
> > ]
> > }
> >
> > How would a query look like that has a Hello from Person Arturas and ciao
> > from Person Vai?
> >
> > Cheers,
> > Arturas
> >
> >
> > On Tue, Apr 3, 2018 at 5:21 PM, Arturas Mazeika 
> wrote:
> >
> > > Hi Mikhail,
> > >
> > > Thanks a lot for the reply.
> > >
> > > You mentioned that
> > >
> > > q=+{!parent which.. v='+text:hello +person:A'} +{!parent
> > > which..v='+text:ciao +person:B'}
> > >
> > > is the way to go. How would it look like precisely for the following
> > > collection?
> > >
> > > {
> > > "id":1,
> > > "_childDocuments_":
> > > [
> > > {"id":"1_1", "person":"Vai" , "time":"3:14",
> > > "msg":"Hello"},
> > > {"id":"1_2", "person":"Arturas" , "time":"3:14",
> > > "msg":"Hello"},
> > > {"id":"1_3", "person":"Vai" , "time":"3:15", "msg":"Coz
> > > Mathias is working on another system- different screen."},
> > > {"id":"1_4", "person":"Vai" , "time":"3:15", "msg":"It
> > can
> > > get annoying"},
> > > {"id":"1_5", "person":"Arturas" , "time":"3:15",
> "msg":"Thank
> > > you. this is very nice of you"},
> > > {"id":"1_6", "person":"Vai" , "time":"3:16",
> > "msg":"ciao"},
> > > {"id":"1_7", "person":"Arturas" , "time":"3:16",
> > "msg":"ciao"}
> > > ]
> > > },
> > > {
> > 

Re: some parent documents

2018-04-04 Thread Mikhail Khludnev
q=+{!parent which=ntype:p v='+msg:Hello +person:Arturas'} +{!parent
which=ntype:p v='+msg:ciao +person:Vai'}

On Wed, Apr 4, 2018 at 12:19 PM, Arturas Mazeika  wrote:

> Hi Mikhail et al,
>
> It seems to me that the nested documents must include nodes that encode the
> level of nodes (within the document). Therefore, the minimal example must
> include the node type. Is the following structure sufficient?
>
> {
> "id":1,
> "ntype":"p",
> "_childDocuments_":
> [
> {"id":"1_1", "ntype":"c", "person":"Vai", "time":"3:14",
> "msg":"Hello"},
> {"id":"1_2", "ntype":"c", "person":"Arturas", "time":"3:14",
> "msg":"Hello"},
> {"id":"1_3", "ntype":"c", "person":"Vai", "time":"3:15",
> "msg":"Coz Mathias is working on another system- different screen."},
> {"id":"1_4", "ntype":"c", "person":"Vai", "time":"3:15",
> "msg":"It can get annoying"},
> {"id":"1_5", "ntype":"c", "person":"Arturas", "time":"3:15",
> "msg":"Thank you. this is very nice of you"},
> {"id":"1_6", "ntype":"c", "person":"Vai", "time":"3:16",
> "msg":"ciao"},
> {"id":"1_7", "ntype":"c", "person":"Arturas", "time":"3:16",
> "msg":"ciao"}
> ]
> },
> {
> "id":2,
> "ntype":"p",
> "_childDocuments_":
> [
> {"id":"2_1", "ntype":"c", "person":"Vai", "time":"4:14",
> "msg":"Hi"},
> {"id":"2_2", "ntype":"c", "person":"Arturas", "time":"4:14",
> "msg":"IBM Watson"},
> {"id":"2_3", "ntype":"c", "person":"Vai", "time":"4:15",
> "msg":"need to retain content"},
> {"id":"2_4", "ntype":"c", "person":"Vai", "time":"4:15",
> "msg":"It can get annoying"},
> {"id":"2_5", "ntype":"c", "person":"Arturas", "time":"4:15",
> "msg":"You can make all your meetings more access"},
> {"id":"2_6", "ntype":"c", "person":"Vai", "time":"4:16",
> "msg":"Make every meeting a Skype meeting"},
> {"id":"2_7", "ntype":"c", "person":"Arturas", "time":"4:16",
> "msg":"ciao"}
> ]
> }
>
> How would a query look like that has a Hello from Person Arturas and ciao
> from Person Vai?
>
> Cheers,
> Arturas
>
>
> On Tue, Apr 3, 2018 at 5:21 PM, Arturas Mazeika  wrote:
>
> > Hi Mikhail,
> >
> > Thanks a lot for the reply.
> >
> > You mentioned that
> >
> > q=+{!parent which.. v='+text:hello +person:A'} +{!parent
> > which..v='+text:ciao +person:B'}
> >
> > is the way to go. How would it look like precisely for the following
> > collection?
> >
> > {
> > "id":1,
> > "_childDocuments_":
> > [
> > {"id":"1_1", "person":"Vai" , "time":"3:14",
> > "msg":"Hello"},
> > {"id":"1_2", "person":"Arturas" , "time":"3:14",
> > "msg":"Hello"},
> > {"id":"1_3", "person":"Vai" , "time":"3:15", "msg":"Coz
> > Mathias is working on another system- different screen."},
> > {"id":"1_4", "person":"Vai" , "time":"3:15", "msg":"It
> can
> > get annoying"},
> > {"id":"1_5", "person":"Arturas" , "time":"3:15", "msg":"Thank
> > you. this is very nice of you"},
> > {"id":"1_6", "person":"Vai" , "time":"3:16",
> "msg":"ciao"},
> > {"id":"1_7", "person":"Arturas" , "time":"3:16",
> "msg":"ciao"}
> > ]
> > },
> > {
> > "id":2,
> > "_childDocuments_":
> > [
> > {"id":"2_1", "person":"Vai" , "time":"4:14",
> > "msg":"Hello"},
> > {"id":"2_2", "person":"Arturas" , "time":"4:14", "msg":"IBM
> > Watson"},
> > {"id":"2_3", "person":"Vai" , "time":"4:15", "msg":"need
> > to retain content"},
> > {"id":"2_4", "person":"Vai" , "time":"4:15", "msg":"It
> can
> > get annoying"},
> > {"id":"2_5", "person":"Arturas" , "time":"4:15", "msg":"You
> > can make all your meetings more access"},
> > {"id":"2_6", "person":"Vai" , "time":"4:16", "msg":"Make
> > every meeting a Skype meeting"},
> > {"id":"2_7", "person":"Arturas" , "time":"4:16",
> "msg":"ciao"}
> > ]
> > }
> >
> > Cheers,
> > Arturas
> >
> >
> > On Tue, Apr 3, 2018 at 4:33 PM, Mikhail Khludnev 
> wrote:
> >
> >> Hello, Arturas.
> >>
> >> TLDR; Please find inline below.
> >>
> >> On Tue, Apr 3, 2018 at 5:14 PM, Arturas Mazeika 
> >> wrote:
> >>
> >> > Hi Solr Fans,
> >> >
> >> > I am trying to make sense of information retrieval using expressions
> >> like
> >> > "some parent", "*only parent*", " *all parent*". I am also trying to
> >> > understand the syntax "!parent which" and "!child of". On the
> technical
> >> > level, I am reading the following documents:
> >> >
> >> > [1]
> >> > https://lucene.apache.org/solr/guide/7_2/other-parsers.
> >> > html#block-join-query-parsers
> >> > [2]
> >> > https://lucene.apache.org/solr/guide/7_2/uploading-data-
> >> > with-index-handlers.html#nested-child-documents
> >> > [3] http://yonik.com/solr-nested-objects/
> >> >
> >> > and I am confused to read:
> >> >
> 

Re: Trying to Restore older indexes in Solr7.2.1

2018-04-04 Thread Mugdha Varadkar
Hi Shawn,

The hash ranges are available in each collections state.json file. Here is
the data for the collections:

*Solr 5.5.5 collection*

{"ranger_audits":{
"replicationFactor":"1",
"router":{"name":"compositeId"},
"maxShardsPerNode":"1",
"autoAddReplicas":"false",
"shards":{"shard1":{
"range":"8000-7fff",
"state":"active",
"replicas":{"core_node1":{
"core":"ranger_audits_shard1_replica1",
"base_url":"http://127.0.0.1:8886/solr;,
"node_name":"127.0.0.1:8886_solr",
"state":"active",
"leader":"true"}}


*Solr 7.2.1 collection*

{"ranger_audits":{
"pullReplicas":"0",
"replicationFactor":"1",
"shards":{"shard1":{
"range":"8000-7fff",
"state":"active",
"replicas":{"core_node2":{
"core":"ranger_audits_shard1_replica_n1",
"base_url":"http://localhost:8886/solr;,
"node_name":"localhost:8886_solr",
"state":"active",
"type":"NRT",
"leader":"true",
"router":{"name":"compositeId"},
"maxShardsPerNode":"1",
"autoAddReplicas":"false",
"nrtReplicas":"1",
"tlogReplicas":"0"}}


> Most schema changes require rebuilding the index from scratch.

So the solution so far is to export the documents of the older collection
after upgrading to Solr 7.2.1 and import them into the new collection.

> Are you getting any errors when you query the index?  Look in solr.log.

Not yet.

Thanks,
Mugdha Varadkar

On Tue, Apr 3, 2018 at 5:58 PM, Shawn Heisey  wrote:

> On 4/3/2018 3:22 AM, Mugdha Varadkar wrote:
>
>>
>> is the collection using the compositeId router?
>>
>> Yes collection used of both the versions are using compositeId router,
>> PFA screenshot of the same.
>>
>
> If you attached a screenshot, it was lost.  The mailing list does not
> allow most attachments through.
>
> You would need to look at the information in zookeeper for both
>> versions
>>
>> Any specific information I can check / look into: shall I check for
>> uploaded configs for the collection or any specific set of properties in
>> solrconfig.xml?
>>
>
> It's not in solrconfig.xml.  I think you'll find the hash ranges for your
> shards in the '/collections/NAME/state.json' file for the collection, in
> zookeeper.  I'm not 100% sure that this is the correct location, but I
> *think* it is.
>
> The difference between the schema of solr-5.5.5 and the schema for
>> solr-7.2.1 is that we are adding docValues on the required fields to improve
>> indexing of those fields.
>> Hence before upgrading from solr-5.5.5, I took a backup of documents
>> using restore-api and then upgraded solr from 5.5.5 to 7.2.1 and tried to
>> restore the backed-up data by converting the documents to the newer format
>> using the index-upgrader tool.
>>
>
> IndexUpgrader is a Lucene tool.  The schema is a Solr invention.
> IndexUpgrader does not know anything about the schema.  It only knows about
> what's actually in the index.
>
> If you try to use a different schema to access your index than the schema
> it was created with, you run the risk that you won't be able to access your
> data.  Most schema changes require rebuilding the index from scratch.
> Adding docValues is a change that requires this.  Running indexUpgrader is
> *NOT* a reindex.
>
>> And the issue I got after restoring the documents of the older version was:
>> all the documents that were in the collection created for solr-7.2.1
>> were not available at all.
>>
>
> Are you getting any errors when you query the index?  Look in solr.log.
>
> Thanks,
> Shawn
>
>


Re: some parent documents

2018-04-04 Thread Arturas Mazeika
Hi Mikhail et al,

It seems to me that the nested documents must include a field that encodes
each node's level within the document. Therefore, the minimal example must
include a node type. Is the following structure sufficient?

{
"id":1,
"ntype":"p",
"_childDocuments_":
[
{"id":"1_1", "ntype":"c", "person":"Vai", "time":"3:14",
"msg":"Hello"},
{"id":"1_2", "ntype":"c", "person":"Arturas", "time":"3:14",
"msg":"Hello"},
{"id":"1_3", "ntype":"c", "person":"Vai", "time":"3:15",
"msg":"Coz Mathias is working on another system- different screen."},
{"id":"1_4", "ntype":"c", "person":"Vai", "time":"3:15",
"msg":"It can get annoying"},
{"id":"1_5", "ntype":"c", "person":"Arturas", "time":"3:15",
"msg":"Thank you. this is very nice of you"},
{"id":"1_6", "ntype":"c", "person":"Vai", "time":"3:16",
"msg":"ciao"},
{"id":"1_7", "ntype":"c", "person":"Arturas", "time":"3:16",
"msg":"ciao"}
]
},
{
"id":2,
"ntype":"p",
"_childDocuments_":
[
{"id":"2_1", "ntype":"c", "person":"Vai", "time":"4:14",
"msg":"Hi"},
{"id":"2_2", "ntype":"c", "person":"Arturas", "time":"4:14",
"msg":"IBM Watson"},
{"id":"2_3", "ntype":"c", "person":"Vai", "time":"4:15",
"msg":"need to retain content"},
{"id":"2_4", "ntype":"c", "person":"Vai", "time":"4:15",
"msg":"It can get annoying"},
{"id":"2_5", "ntype":"c", "person":"Arturas", "time":"4:15",
"msg":"You can make all your meetings more access"},
{"id":"2_6", "ntype":"c", "person":"Vai", "time":"4:16",
"msg":"Make every meeting a Skype meeting"},
{"id":"2_7", "ntype":"c", "person":"Arturas", "time":"4:16",
"msg":"ciao"}
]
}
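
(For completeness: the two documents above can be indexed by wrapping them in
a JSON array, e.g. saved as chatlog.json, and posting them to the stock update
handler; the collection name "chat" is only a placeholder:

curl 'http://localhost:8983/solr/chat/update?commit=true' \
    -H 'Content-Type: application/json' -d @chatlog.json
)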

What would a query look like that requires a "Hello" from person Arturas and
a "ciao" from person Vai?
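
My own guess, following the {!parent} pattern from Mikhail's earlier reply and
assuming ntype:p is the filter matching only parent documents (untested; the
field analysis may affect case matching):

q=+{!parent which='ntype:p' v='+msg:Hello +person:Arturas'}
  +{!parent which='ntype:p' v='+msg:ciao +person:Vai'}

Is that the right direction?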

Cheers,
Arturas


On Tue, Apr 3, 2018 at 5:21 PM, Arturas Mazeika  wrote:

> Hi Mikhail,
>
> Thanks a lot for the reply.
>
> You mentioned that
>
> q=+{!parent which.. v='+text:hello +person:A'} +{!parent
> which..v='+text:ciao +person:B'}
>
> is the way to go. How would it look like precisely for the following
> collection?
>
> {
> "id":1,
> "_childDocuments_":
> [
> {"id":"1_1", "person":"Vai" , "time":"3:14",
> "msg":"Hello"},
> {"id":"1_2", "person":"Arturas" , "time":"3:14",
> "msg":"Hello"},
> {"id":"1_3", "person":"Vai" , "time":"3:15", "msg":"Coz
> Mathias is working on another system- different screen."},
> {"id":"1_4", "person":"Vai" , "time":"3:15", "msg":"It can
> get annoying"},
> {"id":"1_5", "person":"Arturas" , "time":"3:15", "msg":"Thank
> you. this is very nice of you"},
> {"id":"1_6", "person":"Vai" , "time":"3:16", "msg":"ciao"},
> {"id":"1_7", "person":"Arturas" , "time":"3:16", "msg":"ciao"}
> ]
> },
> {
> "id":2,
> "_childDocuments_":
> [
> {"id":"2_1", "person":"Vai" , "time":"4:14",
> "msg":"Hello"},
> {"id":"2_2", "person":"Arturas" , "time":"4:14", "msg":"IBM
> Watson"},
> {"id":"2_3", "person":"Vai" , "time":"4:15", "msg":"need
> to retain content"},
> {"id":"2_4", "person":"Vai" , "time":"4:15", "msg":"It can
> get annoying"},
> {"id":"2_5", "person":"Arturas" , "time":"4:15", "msg":"You
> can make all your meetings more access"},
> {"id":"2_6", "person":"Vai" , "time":"4:16", "msg":"Make
> every meeting a Skype meeting"},
> {"id":"2_7", "person":"Arturas" , "time":"4:16", "msg":"ciao"}
> ]
> }
>
> Cheers,
> Arturas
>
>
> On Tue, Apr 3, 2018 at 4:33 PM, Mikhail Khludnev  wrote:
>
>> Hello, Arturas.
>>
>> TLDR; Please find inline below.
>>
>> On Tue, Apr 3, 2018 at 5:14 PM, Arturas Mazeika 
>> wrote:
>>
>> > Hi Solr Fans,
>> >
>> > I am trying to make sense of information retrieval using expressions
>> like
>> > "some parent", "*only parent*", " *all parent*". I am also trying to
>> > understand the syntax "!parent which" and "!child of". On the technical
>> > level, I am reading the following documents:
>> >
>> > [1]
>> > https://lucene.apache.org/solr/guide/7_2/other-parsers.
>> > html#block-join-query-parsers
>> > [2]
>> > https://lucene.apache.org/solr/guide/7_2/uploading-data-
>> > with-index-handlers.html#nested-child-documents
>> > [3] http://yonik.com/solr-nested-objects/
>> >
>> > and I am confused to read:
>> >
>> > This parser takes a query that matches some parent documents and returns
>> > their children. The syntax for this parser is: q={!child
>> > of=<allParents>}<someParents>. The parameter allParents is a filter that
>> > matches *only parent documents*; here you would define the field and
>> value
>> > that you used to identify *all parent documents*. The parameter
>> someParents
>> > identifies a query that will match some of the parent documents. The
>> output
>> > is the children.
>> >
>> 

PreAnalyzed URP and SchemaRequest API

2018-04-04 Thread Markus Jelsma
Hello,

We intend to move to the PreAnalyzed URP for analysis offloading. Browsing the
Javadocs I came across the SchemaRequest API while looking for a way to get a
Field object remotely, which I seem to need for
JsonPreAnalyzedParser.toFormattedString(Field f). But all I can get from the
SchemaRequest API is a FieldTypeRepresentation, which offers me
getIndexAnalyzer() but won't let me construct a Field object.

So, to analyze remotely I do need an index-time analyzer. I can get that, but
not turn it into a Field object, which the PreAnalyzedParser for some reason
wants.

Any hints here? I must be looking the wrong way.
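
For context, the wire format I ultimately have to produce is the JSON form
that PreAnalyzedField parses; roughly like this, with the token values made up
for illustration:

{"v":"1",
 "str":"Hello world",
 "tokens":[
   {"t":"hello","s":0,"e":5,"i":1},
   {"t":"world","s":6,"e":11,"i":1}]}

where "t" is the term text, "s"/"e" are the start/end offsets and "i" is the
position increment.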

Many thanks!
Markus


ZKPropertiesWriter error DIH (SolrCloud 6.6.1)

2018-04-04 Thread msaunier
Hello,
I use SolrCloud and I am testing the DIH in cloud mode, but I get this error:

Full Import failed:org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to PropertyWriter implementation:ZKPropertiesWriter
        at org.apache.solr.handler.dataimport.DataImporter.createPropertyWriter(DataImporter.java:330)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:411)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:474)
        at org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:457)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
        at org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:935)
        at org.apache.solr.handler.dataimport.DataImporter.createPropertyWriter(DataImporter.java:326)
        ... 4 more

My DIH definition on the cloud

(the data-config XML was stripped by the mailing-list archive)

Call response:

http://localhost:8983/solr/advertisements2/full-advertisements?command=full-import&...=false&...=true

(the XML of the response was also stripped by the archive; the values that
survive are: status 0, QTime 2, true, 1, DIH/advertisements.xml, full-import,
idle)




I don't understand why I get this error. Can you help me?
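
One thing I am not sure about is whether the property writer must be declared
explicitly in the data-config. Per the DIH documentation that would look
something like this (the attribute values are only my guess, untested):

<dataConfig>
  <propertyWriter type="ZKPropertiesWriter" dateFormat="yyyy-MM-dd HH:mm:ss"/>
  <document>
    ...
  </document>
</dataConfig>
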
Thank you.




Re: Classifier for query intent?

2018-04-04 Thread Georg Sorst
Hi wunder,

This sounds like an interesting topic. Can you elaborate a bit on query
intent classification? Where does the training data come from? Do you
manually assign an intent to a query or can this be done in a
(semi-)automatic way? Do you have a fixed list of possible intents
(something like Google has: informational, navigational, transactional)?

Any pointers to useful links or papers maybe?

Thanks!
Georg

Walter Underwood  wrote on Tue., Apr 3, 2018 at 01:18:

> We are experimenting with a text classifier for determining query intent.
> Anybody have a favorite (or anti-favorite) Java implementation? Speed and
> ease of implementation is important.
>
> Right now, we’re mostly looking at Weka and the Stanford Classifier.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>


Multi threaded document atomic OR in-place updates

2018-04-04 Thread pravesh
I have a scenario as follows:

There are two separate threads, each of which tries to update the same
document in a single index but writes a different field; for this we are using
atomic or in-place updates. For example:

id is the unique field in the index

thread-1 will update the following info:
id:1001
field-1:abc1001

thread-2 will update the following info:
id:1001
field-2:xyz1002

The updates are sent to the same core asynchronously.
What I need to know is whether the index can become inconsistent at any point.
Both threads update different fields of the document with the same id.
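
For reference, the updates look roughly like this (curl + JSON; the core name
and commit settings are placeholders):

curl 'http://localhost:8983/solr/mycore/update' \
    -H 'Content-Type: application/json' \
    -d '[{"id":"1001","field-1":{"set":"abc1001"}}]'

curl 'http://localhost:8983/solr/mycore/update' \
    -H 'Content-Type: application/json' \
    -d '[{"id":"1001","field-2":{"set":"xyz1002"}}]'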



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html