Segment information gets deleted

2017-06-08 Thread Chetas Joshi
Hi,

I am trying to understand what the possible root causes for the
following exception could be.


java.io.FileNotFoundException: File does not exist:
hdfs://*/*/*/*/data/index/_2h.si


I had some long GC pauses while executing queries, which took some of the
replicas down. But how could that affect the segment metadata (.si files)
of the Solr indexes?


Thanks!


Re: Solr coreContainer shut down

2017-05-23 Thread Chetas Joshi
Okay. Thanks Shawn.

I am using Chef for deploying SolrCloud as a service. The chef-client runs
every 30 minutes and hence the script "install_solr_service" runs every 30
minutes. I changed that.

On Fri, May 19, 2017 at 5:20 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 5/19/2017 5:05 PM, Chetas Joshi wrote:
> > If I don't wanna upgrade and there is an already installed service, why
> > should it be exit 1 and not exit 0? Shouldn't it be like
> >
> > if [ ! "$SOLR_UPGRADE" = "YES" ]; then
> >
> >   if [ -f "/etc/init.d/$SOLR_SERVICE" ]; then
> >
> > print_usage "/etc/init.d/$SOLR_SERVICE already exists! Perhaps Solr is already setup as a service on this host? To upgrade Solr use the -f option."
> >
> > *exit 0*
> >
> >   fi
>
> When the script reaches this point, the installation has failed, because
> the service already exists and the script wasn't asked to upgrade it.
> That is why it exits with a value of 1.  If it were to exit with 0,
> whatever called the script would assume that the installation was
> successful -- which is not what has happened.
>
> Why are you installing Solr again when it is already installed?
>
> Thanks,
> Shawn
>
>


Re: Solr coreContainer shut down

2017-05-19 Thread Chetas Joshi
I found the reason why this is happening!
I am using Chef and running install_solr_service.sh with the options -n -f. So,
every time chef-client runs, it stops the already running Solr
instance. Now I have removed the -f option (no upgrade) but am running into an
error.

I have a question on the following piece of code.

if [ ! "$SOLR_UPGRADE" = "YES" ]; then

  if [ -f "/etc/init.d/$SOLR_SERVICE" ]; then

print_usage "/etc/init.d/$SOLR_SERVICE already exists! Perhaps Solr is already setup as a service on this host? To upgrade Solr use the -f option."

exit 1

  fi


  if [ -e "$SOLR_EXTRACT_DIR/$SOLR_SERVICE" ]; then

print_usage "$SOLR_EXTRACT_DIR/$SOLR_SERVICE already exists! Please move this directory / link or choose a different service name using the -s option."

exit 1

  fi

fi


If I don't wanna upgrade and there is an already installed service, why
should it be exit 1 and not exit 0? Shouldn't it be like

if [ ! "$SOLR_UPGRADE" = "YES" ]; then

  if [ -f "/etc/init.d/$SOLR_SERVICE" ]; then

print_usage "/etc/init.d/$SOLR_SERVICE already exists! Perhaps Solr is already setup as a service on this host? To upgrade Solr use the -f option."

*exit 0*

  fi


Thanks!

On Fri, May 19, 2017 at 1:59 PM, Chetas Joshi <chetas.jo...@gmail.com>
wrote:

> Hello,
>
> I am trying to set up a solrCloud (6.5.0/6.5.1). I have installed Solr as
> a service.
> Every time I start solr servers, they come up but one by one the
> coreContainers start shutting down on their own within 1-2 minutes of their
> being up.
>
> Here are the solr logs
>
> 2017-05-19 20:45:30.926 INFO  (main) [   ] o.e.j.s.Server Started @1600ms
>
> 2017-05-19 20:47:21.252 INFO  (ShutdownMonitor) [   ]
> o.a.s.c.CoreContainer Shutting down CoreContainer instance=1364767791
>
> 2017-05-19 20:47:21.262 INFO  (ShutdownMonitor) [   ] o.a.s.c.Overseer
> Overseer (id=169527934494244988-:8983_solr-n_06) closing
>
> 2017-05-19 20:47:21.263 INFO  (OverseerStateUpdate-169527934494244988-
> :8983_solr-n_06) [   ] o.a.s.c.Overseer Overseer Loop
> exiting : :8983_solr
>
> 2017-05-19 20:47:21.268 INFO  (ShutdownMonitor) [   ]
> o.a.s.m.SolrMetricManager Closing metric reporters for: solr.node
>
>
> The coreContainer just shuts down (no info in the solr logs). Is the jetty
> servlet container having some issue? Is it possible to look at the Jetty
> servlet container logs?
>
> Thanks!
>


Solr coreContainer shut down

2017-05-19 Thread Chetas Joshi
Hello,

I am trying to set up a solrCloud (6.5.0/6.5.1). I have installed Solr as a
service.
Every time I start solr servers, they come up but one by one the
coreContainers start shutting down on their own within 1-2 minutes of their
being up.

Here are the solr logs

2017-05-19 20:45:30.926 INFO  (main) [   ] o.e.j.s.Server Started @1600ms

2017-05-19 20:47:21.252 INFO  (ShutdownMonitor) [   ] o.a.s.c.CoreContainer
Shutting down CoreContainer instance=1364767791

2017-05-19 20:47:21.262 INFO  (ShutdownMonitor) [   ] o.a.s.c.Overseer
Overseer (id=169527934494244988-:8983_solr-n_06) closing

2017-05-19 20:47:21.263 INFO
(OverseerStateUpdate-169527934494244988-:8983_solr-n_06)
[   ] o.a.s.c.Overseer Overseer Loop exiting : :8983_solr

2017-05-19 20:47:21.268 INFO  (ShutdownMonitor) [   ]
o.a.s.m.SolrMetricManager Closing metric reporters for: solr.node


The coreContainer just shuts down (no info in the solr logs). Is the jetty
servlet container having some issue? Is it possible to look at the Jetty
servlet container logs?

Thanks!


Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-13 Thread Chetas Joshi
Hi Shawn,

Thanks for the insights into the memory requirements. It looks like the cursor
approach is going to require a lot of memory for millions of documents.
If I run a query that returns only 500K documents, while still keeping 100K docs
per page, I don't see long GC pauses. So it is not really the number of
rows per page but the overall number of docs. Maybe I can reduce the
document cache and the field cache. What do you think?

Erick,

I was using the streaming approach to get results back from Solr but I was
running into some runtime exceptions. That bug has been fixed in Solr 6.0,
but for various reasons I won't be able to move to Java 8, and hence I
will have to stick with Solr 5.5.0. That is the reason I had to switch to the
cursor approach.

Thanks!

On Wed, Apr 12, 2017 at 8:37 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> You're missing the point of my comment. Since they already are
> docValues, you can use the /export functionality to get the results
> back as a _stream_ and avoid all of the overhead of the aggregator
> node doing a merge sort and all of that.
>
> You'll have to do this from SolrJ, but see CloudSolrStream. You can
> see examples of its usage in StreamingTest.java.
>
> this should
> 1> complete much, much faster. The design goal is 400K rows/second but YMMV
> 2> use vastly less memory on your Solr instances.
> 3> only require _one_ query
>
> Best,
> Erick
>
> On Wed, Apr 12, 2017 at 7:36 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> > On 4/12/2017 5:19 PM, Chetas Joshi wrote:
> >> I am getting back 100K results per page.
> >> The fields have docValues enabled and I am getting sorted results based
> on "id" and 2 more fields (String: 32 Bytes and Long: 8 Bytes).
> >>
> >> I have a solr Cloud of 80 nodes. There will be one shard that will get
> top 100K docs from each shard and apply merge sort. So, the max memory
> usage of any shard could be 40 bytes * 100K * 80 = 320 MB. Why would heap
> memory usage shoot up from 8 GB to 17 GB?
> >
> > From what I understand, Java overhead for a String object is 56 bytes
> > above the actual byte size of the string itself.  And each character in
> > the string will be two bytes -- Java uses UTF-16 for character
> > representation internally.  If I'm right about these numbers, it means
> > that each of those id values will take 120 bytes -- and that doesn't
> > include the size of the actual response (xml, json, etc).
> >
> > I don't know what the overhead for a long is, but you can be sure that
> > it's going to take more than eight bytes total memory usage for each one.
> >
> > Then there is overhead for all the Lucene memory structures required to
> > execute the query and gather results, plus Solr memory structures to
> > keep track of everything.  I have absolutely no idea how much memory
> > Lucene and Solr use to accomplish a query, but it's not going to be
> > small when you have 200 million documents per shard.
> >
> > Speaking of Solr memory requirements, under normal query circumstances
> > the aggregating node is going to receive at least 100K results from
> > *every* shard in the collection, which it will condense down to the
> > final result with 100K entries.  The behavior during a cursor-based
> > request may be more memory-efficient than what I have described, but I
> > am unsure whether that is the case.
> >
> > If the cursor behavior is not more efficient, then each entry in those
> > results will contain the uniqueKey value and the score.  That's going to
> > be many megabytes for every shard.  If there are 80 shards, it would
> > probably be over a gigabyte for one request.
> >
> > Thanks,
> > Shawn
> >
>
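
For reference, here is a rough SolrJ sketch (in Java) of the /export plus CloudSolrStream pattern Erick points to, loosely modeled on the StreamingTest.java usage he mentions. The ZooKeeper address, collection name, and field names are placeholders rather than values from this thread, and the constructor shown is the Solr 6.x form that takes SolrParams; treat it as an illustration, not a drop-in implementation. The sort and fl fields must have docValues enabled, which matches the setup described above.

import java.io.IOException;

import org.apache.solr.client.solrj.io.SolrClientCache;
import org.apache.solr.client.solrj.io.Tuple;
import org.apache.solr.client.solrj.io.stream.CloudSolrStream;
import org.apache.solr.client.solrj.io.stream.StreamContext;
import org.apache.solr.common.params.ModifiableSolrParams;

public class ExportStreamSketch {
  public static void main(String[] args) throws IOException {
    // Placeholders -- substitute your own ZooKeeper ensemble and collection.
    String zkHost = "zk1:2181,zk2:2181,zk3:2181/solr";
    String collection = "myCollection";

    // /export streams the full sorted result set; exported fields need docValues.
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("q", "*:*");
    params.set("qt", "/export");
    params.set("fl", "id,fieldA,fieldB");
    params.set("sort", "id asc,fieldA asc,fieldB asc");

    CloudSolrStream stream = new CloudSolrStream(zkHost, collection, params);

    // The stream context supplies a shared SolrClientCache for the underlying clients.
    StreamContext context = new StreamContext();
    SolrClientCache clientCache = new SolrClientCache();
    context.setSolrClientCache(clientCache);
    stream.setStreamContext(context);

    try {
      stream.open();
      while (true) {
        Tuple tuple = stream.read();
        if (tuple.EOF) {   // the EOF tuple marks the end of the merged stream
          break;
        }
        // Process one document's worth of exported fields.
        System.out.println(tuple.getString("id"));
      }
    } finally {
      stream.close();
      clientCache.close();
    }
  }
}

Because the tuples arrive as a stream and are processed one at a time, the client never has to hold a full 100K-row page in memory, which is the point Erick makes about memory use on the Solr side as well.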


Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-12 Thread Chetas Joshi
I am getting back 100K results per page.
The fields have docValues enabled and I am getting sorted results based on
"id" and 2 more fields (String: 32 Bytes and Long: 8 Bytes).

I have a SolrCloud of 80 nodes. There will be one node that gets the top
100K docs from each shard and applies a merge sort. So, the max memory usage on
that node could be about 40 bytes * 100K * 80 = 320 MB. Why would heap memory
usage shoot up from 8 GB to 17 GB?

Thanks!

On Wed, Apr 12, 2017 at 1:32 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Oh my. Returning 100K rows per request is usually poor practice.
> One hopes these are very tiny docs.
>
> But this may well be an "XY" problem. What kinds of information
> are you returning in your docs and could they all be docValues
> types? In which case you would be waaay far ahead by using
> the various Streaming options.
>
> Best,
> Erick
>
> On Wed, Apr 12, 2017 at 12:59 PM, Chetas Joshi <chetas.jo...@gmail.com>
> wrote:
> > I am running a query that returns 10 MM docs in total and the number of
> > rows per page is 100K.
> >
> > On Wed, Apr 12, 2017 at 12:53 PM, Mikhail Khludnev <gge...@gmail.com>
> wrote:
> >
> >> And what is the rows parameter?
> >>
> >> On 12 Apr 2017 at 21:32, "Chetas Joshi" <
> chetas.jo...@gmail.com>
> >> wrote:
> >>
> >> > Thanks for your response Shawn and Wunder.
> >> >
> >> > Hi Shawn,
> >> >
> >> > Here is the system config:
> >> >
> >> > Total system memory = 512 GB
> >> > each server handles two 500 MB cores
> >> > Number of solr docs per 500 MB core = 200 MM
> >> >
> >> > The average heap usage is around 4-6 GB. When the read starts using
> the
> >> > Cursor approach, the heap usage starts increasing with the base of the
> >> > sawtooth at 8 GB and then shooting up to 17 GB. Even after the full
> GC,
> >> the
> >> > heap usage remains around 15 GB and then it comes down to 8 GB.
> >> >
> >> > With 100K docs, the requirement will be in MBs so it is strange it is
> >> > jumping from 8 GB to 17 GB while preparing the sorted response.
> >> >
> >> > Thanks!
> >> >
> >> >
> >> >
> >> > On Tue, Apr 11, 2017 at 8:48 PM, Walter Underwood <
> wun...@wunderwood.org
> >> >
> >> > wrote:
> >> >
> >> > > JVM version? We’re running v8 update 121 with the G1 collector and
> it
> >> is
> >> > > working really well. We also have an 8GB heap.
> >> > >
> >> > > Graph your heap usage. You’ll see a sawtooth shape, where it grows,
> >> then
> >> > > there is a major GC. The maximum of the base of the sawtooth is the
> >> > working
> >> > > set of heap that your Solr installation needs. Set the heap to that
> >> > value,
> >> > > plus a gigabyte or so. We run with a 2GB eden (new space) because so
> >> much
> >> > > of Solr’s allocations have a lifetime of one request. So, the base
> of
> >> the
> >> > > sawtooth, plus a gigabyte breathing room, plus two more for eden.
> That
> >> > > should work.
> >> > >
> >> > > I don’t set all the ratios and stuff. When were running CMS, I set a
> >> size
> >> > > for the heap and a size for the new space. Done. With G1, I don’t
> even
> >> > get
> >> > > that fussy.
> >> > >
> >> > > wunder
> >> > > Walter Underwood
> >> > > wun...@wunderwood.org
> >> > > http://observer.wunderwood.org/  (my blog)
> >> > >
> >> > >
> >> > > > On Apr 11, 2017, at 8:22 PM, Shawn Heisey <apa...@elyograg.org>
> >> wrote:
> >> > > >
> >> > > > On 4/11/2017 2:56 PM, Chetas Joshi wrote:
> >> > > >> I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Sold
> >> > collection
> >> > > >> with number of shards = 80 and replication Factor=2
> >> > > >>
> >> > > >> Sold JVM heap size = 20 GB
> >> > > >> solr.hdfs.blockcache.enabled = true
> >> > > >> solr.hdfs.blockcache.direct.memory.allocation = true
> >> > > >> MaxDirectMemorySize = 25 GB
> >> > > >>
> >> > > >> I am querying a 

Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-12 Thread Chetas Joshi
I am running a query that returns 10 MM docs in total and the number of
rows per page is 100K.

On Wed, Apr 12, 2017 at 12:53 PM, Mikhail Khludnev <gge...@gmail.com> wrote:

> And what is the rows parameter?
>
> On 12 Apr 2017 at 21:32, "Chetas Joshi" <chetas.jo...@gmail.com>
> wrote:
>
> > Thanks for your response Shawn and Wunder.
> >
> > Hi Shawn,
> >
> > Here is the system config:
> >
> > Total system memory = 512 GB
> > each server handles two 500 MB cores
> > Number of solr docs per 500 MB core = 200 MM
> >
> > The average heap usage is around 4-6 GB. When the read starts using the
> > Cursor approach, the heap usage starts increasing with the base of the
> > sawtooth at 8 GB and then shooting up to 17 GB. Even after the full GC,
> the
> > heap usage remains around 15 GB and then it comes down to 8 GB.
> >
> > With 100K docs, the requirement will be in MBs so it is strange it is
> > jumping from 8 GB to 17 GB while preparing the sorted response.
> >
> > Thanks!
> >
> >
> >
> > On Tue, Apr 11, 2017 at 8:48 PM, Walter Underwood <wun...@wunderwood.org
> >
> > wrote:
> >
> > > JVM version? We’re running v8 update 121 with the G1 collector and it
> is
> > > working really well. We also have an 8GB heap.
> > >
> > > Graph your heap usage. You’ll see a sawtooth shape, where it grows,
> then
> > > there is a major GC. The maximum of the base of the sawtooth is the
> > working
> > > set of heap that your Solr installation needs. Set the heap to that
> > value,
> > > plus a gigabyte or so. We run with a 2GB eden (new space) because so
> much
> > > of Solr’s allocations have a lifetime of one request. So, the base of
> the
> > > sawtooth, plus a gigabyte breathing room, plus two more for eden. That
> > > should work.
> > >
> > > I don’t set all the ratios and stuff. When were running CMS, I set a
> size
> > > for the heap and a size for the new space. Done. With G1, I don’t even
> > get
> > > that fussy.
> > >
> > > wunder
> > > Walter Underwood
> > > wun...@wunderwood.org
> > > http://observer.wunderwood.org/  (my blog)
> > >
> > >
> > > > On Apr 11, 2017, at 8:22 PM, Shawn Heisey <apa...@elyograg.org>
> wrote:
> > > >
> > > > On 4/11/2017 2:56 PM, Chetas Joshi wrote:
> > > >> I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Sold
> > collection
> > > >> with number of shards = 80 and replication Factor=2
> > > >>
> > > >> Sold JVM heap size = 20 GB
> > > >> solr.hdfs.blockcache.enabled = true
> > > >> solr.hdfs.blockcache.direct.memory.allocation = true
> > > >> MaxDirectMemorySize = 25 GB
> > > >>
> > > >> I am querying a solr collection with index size = 500 MB per core.
> > > >
> > > > I see that you and I have traded messages before on the list.
> > > >
> > > > How much total system memory is there per server?  How many of these
> > > > 500MB cores are on each server?  How many docs are in a 500MB core?
> > The
> > > > answers to these questions may affect the other advice that I give
> you.
> > > >
> > > >> The off-heap (25 GB) is huge so that it can load the entire index.
> > > >
> > > > I still know very little about how HDFS handles caching and memory.
> > You
> > > > want to be sure that as much data as possible from your indexes is
> > > > sitting in local memory on the server.
> > > >
> > > >> Using cursor approach (number of rows = 100K), I read 2 fields
> (Total
> > 40
> > > >> bytes per solr doc) from the Solr docs that satisfy the query. The
> > docs
> > > are sorted by "id" and then by those 2 fields.
> > > >>
> > > >> I am not able to understand why the heap memory is getting full and
> > Full
> > > >> GCs are consecutively running with long GC pauses (> 30 seconds). I
> am
> > > >> using CMS GC.
> > > >
> > > > A 20GB heap is quite large.  Do you actually need it to be that
> large?
> > > > If you graph JVM heap usage over a long period of time, what are the
> > low
> > > > points in the graph?
> > > >
> > > > A result containing 100K docs is going to be pretty large, even with
> a
> >

Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-12 Thread Chetas Joshi
Thanks for your response Shawn and Wunder.

Hi Shawn,

Here is the system config:

Total system memory = 512 GB
each server handles two 500 MB cores
Number of solr docs per 500 MB core = 200 MM

The average heap usage is around 4-6 GB. When the read starts using the
Cursor approach, the heap usage starts increasing with the base of the
sawtooth at 8 GB and then shooting up to 17 GB. Even after the full GC, the
heap usage remains around 15 GB and then it comes down to 8 GB.

With 100K docs, the memory requirement should be in the MBs, so it is strange
that it jumps from 8 GB to 17 GB while preparing the sorted response.

Thanks!



On Tue, Apr 11, 2017 at 8:48 PM, Walter Underwood <wun...@wunderwood.org>
wrote:

> JVM version? We’re running v8 update 121 with the G1 collector and it is
> working really well. We also have an 8GB heap.
>
> Graph your heap usage. You’ll see a sawtooth shape, where it grows, then
> there is a major GC. The maximum of the base of the sawtooth is the working
> set of heap that your Solr installation needs. Set the heap to that value,
> plus a gigabyte or so. We run with a 2GB eden (new space) because so much
> of Solr’s allocations have a lifetime of one request. So, the base of the
> sawtooth, plus a gigabyte breathing room, plus two more for eden. That
> should work.
>
> I don’t set all the ratios and stuff. When we were running CMS, I set a size
> for the heap and a size for the new space. Done. With G1, I don’t even get
> that fussy.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Apr 11, 2017, at 8:22 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> >
> > On 4/11/2017 2:56 PM, Chetas Joshi wrote:
> >> I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Sold collection
> >> with number of shards = 80 and replication Factor=2
> >>
> >> Sold JVM heap size = 20 GB
> >> solr.hdfs.blockcache.enabled = true
> >> solr.hdfs.blockcache.direct.memory.allocation = true
> >> MaxDirectMemorySize = 25 GB
> >>
> >> I am querying a solr collection with index size = 500 MB per core.
> >
> > I see that you and I have traded messages before on the list.
> >
> > How much total system memory is there per server?  How many of these
> > 500MB cores are on each server?  How many docs are in a 500MB core?  The
> > answers to these questions may affect the other advice that I give you.
> >
> >> The off-heap (25 GB) is huge so that it can load the entire index.
> >
> > I still know very little about how HDFS handles caching and memory.  You
> > want to be sure that as much data as possible from your indexes is
> > sitting in local memory on the server.
> >
> >> Using cursor approach (number of rows = 100K), I read 2 fields (Total 40
> >> bytes per solr doc) from the Solr docs that satisfy the query. The docs
> are sorted by "id" and then by those 2 fields.
> >>
> >> I am not able to understand why the heap memory is getting full and Full
> >> GCs are consecutively running with long GC pauses (> 30 seconds). I am
> >> using CMS GC.
> >
> > A 20GB heap is quite large.  Do you actually need it to be that large?
> > If you graph JVM heap usage over a long period of time, what are the low
> > points in the graph?
> >
> > A result containing 100K docs is going to be pretty large, even with a
> > limited number of fields.  It is likely to be several megabytes.  It
> > will need to be entirely built in the heap memory before it is sent to
> > the client -- both as Lucene data structures (which will probably be
> > much larger than the actual response due to Java overhead) and as the
> > actual response format.  Then it will be garbage as soon as the response
> > is done.  Repeat this enough times, and you're going to go through even
> > a 20GB heap pretty fast, and need a full GC.  Full GCs on a 20GB heap
> > are slow.
> >
> > You could try switching to G1, as long as you realize that you're going
> > against advice from Lucene experts but honestly, I do not expect
> > this to really help, because you would probably still need full GCs due
> > to the rate that garbage is being created.  If you do try it, I would
> > strongly recommend the latest Java 8, either Oracle or OpenJDK.  Here's
> > my wiki page where I discuss this:
> >
> > https://wiki.apache.org/solr/ShawnHeisey#G1_.28Garbage_
> First.29_Collector
> >
> > Reducing the heap size (which may not be possible -- need to know the
> > answer to the question about memory graphing) and reducing the number of
> > rows per query are the only quick solutions I can think of.
> >
> > Thanks,
> > Shawn
> >
>
>


Long GC pauses while reading Solr docs using Cursor approach

2017-04-11 Thread Chetas Joshi
Hello,

I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Solr collection
with number of shards = 80 and replicationFactor = 2

Solr JVM heap size = 20 GB
solr.hdfs.blockcache.enabled = true
solr.hdfs.blockcache.direct.memory.allocation = true
MaxDirectMemorySize = 25 GB

I am querying a solr collection with index size = 500 MB per core.

The off-heap (25 GB) is huge so that it can load the entire index.

Using cursor approach (number of rows = 100K), I read 2 fields (Total 40
bytes per solr doc) from the Solr docs that satisfy the query. The docs are
sorted by "id" and then by those 2 fields.

I am not able to understand why the heap memory is getting full and full
GCs are running consecutively with long GC pauses (> 30 seconds). I am
using the CMS GC.

-XX:NewRatio=3 \

-XX:SurvivorRatio=4 \

-XX:TargetSurvivorRatio=90 \

-XX:MaxTenuringThreshold=8 \

-XX:+UseConcMarkSweepGC \

-XX:+UseParNewGC \

-XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 \

-XX:+CMSScavengeBeforeRemark \

-XX:PretenureSizeThreshold=64m \

-XX:+UseCMSInitiatingOccupancyOnly \

-XX:CMSInitiatingOccupancyFraction=50 \

-XX:CMSMaxAbortablePrecleanTime=6000 \

-XX:+CMSParallelRemarkEnabled \

-XX:+ParallelRefProcEnabled


Please guide me in debugging the heap usage issue.


Thanks!


Re: CloudSolrClient stuck in a loop with a recurring exception

2017-02-22 Thread Chetas Joshi
Yes, it is Scala.
And yes, I just wanted to confirm that I had to add exception handling and
break out of the loop.

Chetas.

On Wed, Feb 22, 2017 at 4:25 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 2/22/2017 4:59 PM, Chetas Joshi wrote:
> > 2017-02-22 15:27:06,994 ERROR o.a.s.c.solrj.impl.CloudSolrClient ~
> Request
> > to collection xx failed due to (510) org.apache.solr.common.
> > SolrException: Could not find a healthy node to handle the request.,
> retry?
> >
> > Here is my code snippet. I go through a loop until I get the last page to
> > get back all the results from Solr using the cursor approach. Do I need
> to
> > take care of the above situation/exceptions in my code?
> >
> > while(true){
> >
> > val rsp: QueryResponse = *cloudSolrclient*.query(cursorQ)
> > val nextCursorMark: String = rsp.getNextCursorMark
> >
> > if (cursorMark.equals(nextCursorMark)) break; cloudSolrClient.close()
> >
> > }
>
> This doesn't look like Java code, so I'm assuming it's Scala, and I do
> not have any experience with that language.  There doesn't seem to be
> any exception handling.  The query method will throw an exception if the
> server's not available.  You must handle that in your code and take
> appropriate action, such as breaking out of the loop.
>
> Thanks,
> Shawn
>
>
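
To make that concrete, here is a minimal sketch in Java (the snippet above is Scala) of a cursorMark loop that catches the exception and breaks out of the loop instead of retrying forever, along the lines Shawn suggests. The ZooKeeper address, collection name, sort field, and row count are placeholders, not values confirmed in this thread.

import java.io.IOException;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.params.CursorMarkParams;

public class CursorPagingSketch {
  public static void main(String[] args) throws IOException {
    // Placeholders -- substitute your own ZooKeeper ensemble and collection.
    CloudSolrClient client = new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181/solr");
    client.setDefaultCollection("myCollection");

    SolrQuery query = new SolrQuery("*:*");
    query.setRows(100000);
    // Cursor paging requires a sort that includes the uniqueKey field.
    query.setSort(SolrQuery.SortClause.asc("id"));

    String cursorMark = CursorMarkParams.CURSOR_MARK_START;
    try {
      while (true) {
        query.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
        QueryResponse rsp;
        try {
          rsp = client.query(query);
        } catch (SolrException | SolrServerException | IOException e) {
          // e.g. "Could not find a healthy node to handle the request."
          // Stop (or retry with backoff) instead of looping forever.
          System.err.println("Query failed, aborting cursor loop: " + e.getMessage());
          break;
        }

        // ... process rsp.getResults() for this page here ...

        String nextCursorMark = rsp.getNextCursorMark();
        if (cursorMark.equals(nextCursorMark)) {
          break; // same cursor mark returned: no more pages
        }
        cursorMark = nextCursorMark;
      }
    } finally {
      client.close();
    }
  }
}

The key differences from the original snippet are that the cursor mark is actually advanced between pages, the client is closed only once in the finally block, and a failed query ends the loop rather than being retried indefinitely.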


CloudSolrClient stuck in a loop with a recurring exception

2017-02-22 Thread Chetas Joshi
Hello,

I am using Solr 5.5.1. Solr Cloud of 80 nodes deployed on HDFS.

To get back results from Solr, I use the cursor approach and the
cloudSolrClient object. While a query was running, I took the solr Cloud
down. The client got stuck in a loop with the following exception:

2017-02-22 15:27:06,985 ERROR o.a.s.c.solrj.impl.CloudSolrClient ~ Request
to collection x failed due to (510) org.apache.solr.common.SolrException
: Could not find a healthy node to handle the request., retry? 0
.
.
.
.
.
2017-02-22 15:27:06,994 ERROR o.a.s.c.solrj.impl.CloudSolrClient ~ Request
to collection xx failed due to (510) org.apache.solr.common.
SolrException: Could not find a healthy node to handle the request., retry?
5

Here is my code snippet. I go through a loop until I get the last page to
get back all the results from Solr using the cursor approach. Do I need to
take care of the above situation/exceptions in my code?

while(true){

val rsp: QueryResponse = *cloudSolrclient*.query(cursorQ)
val nextCursorMark: String = rsp.getNextCursorMark

if (cursorMark.equals(nextCursorMark)) break; cloudSolrClient.close()

}

These exceptions are getting generated from *cloudSolrclient*.query(cursorQ)
while getting back the response. Should I catch those exceptions and close
the client if they cross a particular threshold?

Chetas.


Re: A collection gone missing: uninteresting collection

2017-01-21 Thread Chetas Joshi
Is this visible in the logs? I mean, how do I find out that a "DELETE
collection" API call was made?

Is the following indicative of the fact that the API call was made?

2017-01-20 20:42:39,822 INFO org.apache.solr.cloud.
ShardLeaderElectionContextBase: Removing leader registration node on
cancel: /collections/3044_01_17/leaders/shard4/leader 9

2017-01-20 20:42:39,832 INFO org.apache.solr.cloud.ElectionContext:
Canceling election /collections/3044_01_17/leader_elect/shard4/election/
241183598302995297-core_node3-n_08

2017-01-20 20:42:39,833 INFO org.apache.solr.common.cloud.ZkStateReader:
Removing watch for uninteresting collection [3044_01_17]

  "core":"3044_01_17_shard4_replica1",

  "collection":"3044_01_17",


I am confused, as the logs only talk about shard4, not all the shards of the
collection.


A collection gone missing: uninteresting collection

2017-01-20 Thread Chetas Joshi
Hello,

I have been running Solr (5.5.0) on HDFS.

Recently, a collection just went missing, with all the instanceDirs and
dataDirs getting deleted. The following logs appeared in the SolrCloud overseer.

2017-01-20 20:42:39,515 INFO org.apache.solr.core.SolrCore:
[3044_01_17_shard4_replica1]  CLOSING SolrCore
org.apache.solr.core.SolrCore@2e2e0a23

2017-01-20 20:42:39,665 INFO org.apache.solr.core.SolrCore:
[3044_01_17_shard4_replica1] Closing main searcher on request.

2017-01-20 20:42:39,690 INFO org.apache.solr.core.CachingDirectoryFactory:
looking to close hdfs://Ingest/solr53/3044_01_17/core_node3/data

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-13 Thread Chetas Joshi
Erick, I have not changed any config. I have autoAddReplica = true in the
individual collection config as well as the overall cluster config. Still,
it does not add a replica when I decommission a node.

Adding a replica is the overseer's job. I looked at the logs of the overseer of
the SolrCloud but could not find anything there either.

I am doing some testing using different configs. I would be happy to share
my findings.

One of the things I have observed is: if I use the collection API to create
a replica for that shard, it does not complain about the config which has
been set to ReplicationFactor=1. If replication factor was the issue as
suggested by Shawn, shouldn't it complain?

I would also like to mention that I have experienced some instance dirs getting
deleted and have also found this open bug (
https://issues.apache.org/jira/browse/SOLR-8905).

Thanks!

On Thu, Jan 12, 2017 at 9:50 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Hmmm, have you changed any of the settings for autoAddReplicas? There
> are several parameters that govern how long before a replica would be
> added.
>
> But I suggest you use the Cloudera resources for this question, not
> only did they write this functionality, but Cloudera support is deeply
> embedded in HDFS and I suspect has _by far_ the most experience with
> it.
>
> And that said, anything you find out that would suggest good ways to
> clarify the docs would be most welcome!
>
> Best,
> Erick
>
> On Thu, Jan 12, 2017 at 8:42 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> > On 1/11/2017 7:14 PM, Chetas Joshi wrote:
> >> This is what I understand about how Solr works on HDFS. Please correct
> me
> >> if I am wrong.
> >>
> >> Although solr shard replication Factor = 1, HDFS default replication =
> 3.
> >> When the node goes down, the solr server running on that node goes down
> and
> >> hence the instance (core) representing the replica goes down. The data
> in
> >> on HDFS (distributed across all the datanodes of the hadoop cluster
> with 3X
> >> replication).  This is the reason why I have kept replicationFactor=1.
> >>
> >> As per the link:
> >> https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS
> >> One benefit to running Solr in HDFS is the ability to automatically add
> new
> >> replicas when the Overseer notices that a shard has gone down. Because
> the
> >> "gone" index shards are stored in HDFS, a new core will be created and
> the
> >> new core will point to the existing indexes in HDFS.
> >>
> >> This is the expected behavior of Solr overseer which I am not able to
> see.
> >> After a couple of hours a node was assigned to host the shard but the
> >> status of the shard is still "down" and the instance dir is missing on
> that
> >> node for that particular shard_replica.
> >
> > As I said before, I know very little about HDFS, so the following could
> > be wrong, but it makes sense so I'll say it:
> >
> > I would imagine that Solr doesn't know or care what your HDFS
> > replication is ... the only replicas it knows about are the ones that it
> > is managing itself.  The autoAddReplicas feature manages *SolrCloud*
> > replicas, not HDFS replicas.
> >
> > I have seen people say that multiple SolrCloud replicas will take up
> > additional space in HDFS -- they do not point at the same index files.
> > This is because proper Lucene operation requires that it lock an index
> > and prevent any other thread/process from writing to the index at the
> > same time.  When you index, SolrCloud updates all replicas independently
> > -- the only time indexes are replicated is when you add a new replica or
> > a serious problem has occurred and an index needs to be recovered.
> >
> > Thanks,
> > Shawn
> >
>


Solr on HDFS: AutoAddReplica does not add a replica

2017-01-11 Thread Chetas Joshi
Hello,

I have deployed a SolrCloud (solr 5.5.0) on hdfs using cloudera 5.4.7. The
cloud has 86 nodes.

This is my config for the collection

numShards=80
ReplicationFactor=1
maxShardsPerNode=1
autoAddReplica=true

I recently decommissioned a node to resolve some disk issues. The shard
that was hosted on that node is now shown as "gone" on the Solr
admin UI.

I got the cluster status using the collection API. It says
shard: active, replica: down

The overseer does not seem to be creating an extra core even though
autoAddReplica=true (
https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS).

Is this happening because the overseer sees the shard as active as
suggested by the cluster status?
If yes, is "autoAddReplica" not reliable? Should I add a replica for this
shard when such cases arise?

Thanks!


Re: changing state.json using ZKCLI

2017-01-11 Thread Chetas Joshi
Thanks Shawn and Erick!

On Wed, Jan 11, 2017 at 6:18 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> BTW, since Solr 6.2 you can get/put arbitrary znodes using the
> bin/solr script with more unix-like commands. To see the options type
> bin/solr zk -help
>
> You can get the same functionality out of either, it's a matter of
> which one you're more comfortable with.
>
> Erick
>
>
>
> On Tue, Jan 10, 2017 at 11:12 PM, Shawn Heisey <apa...@elyograg.org>
> wrote:
> > On 1/10/2017 5:28 PM, Chetas Joshi wrote:
> >> I have got 2 shards having hash range set to null due to some index
> >> corruption.
> >>
> >> I am trying to manually get, edit and put the file.
> > 
> >> ./zkcli.sh -zkhost ${zkhost} -cmd putfile ~/colName_state.json
> >> /collections/colName/state.json
> >>
> >> I am getting FileNotFound exception with the putfile command
> >
> > You've got the parameters backwards.  The zookeeper location comes first.
> >
> > Run the zkcli script with no parameters to see the examples of the
> > various commands available, which includes this line:
> >
> > zkcli.sh -zkhost localhost:9983 -cmd putfile /solr.xml
> > /User/myuser/solr/solr.xml
> >
> > Thanks,
> > Shawn
> >
>


changing state.json using ZKCLI

2017-01-10 Thread Chetas Joshi
Hello,

I have got 2 shards having hash range set to null due to some index
corruption.

I am trying to manually get, edit and put the file.

./zkcli.sh -zkhost ${zkhost} -cmd getfile /collections/colName/state.json
~/colName_state.json


./zkcli.sh -zkhost ${zkhost} -cmd clear /collections/colName/state.json


./zkcli.sh -zkhost ${zkhost} -cmd putfile ~/colName_state.json
/collections/colName/state.json


I am getting FileNotFound exception with the putfile command


Exception in thread "main" java.io.FileNotFoundException:
/collections/colName/state.json (No such file or directory)

at java.io.FileInputStream.open(Native Method)

at java.io.FileInputStream.<init>(FileInputStream.java:146)

at java.io.FileInputStream.<init>(FileInputStream.java:101)


I just got the file from the same location.

Why is it throwing this exception?

How should I find out the correct location on the zookeeper node?


Thanks!


Re: Missing shards/hash range

2017-01-10 Thread Chetas Joshi
Want to add a couple of things

1) Shards were not deleted using the delete replica collection API
endpoint.
2) instanceDir and dataDir exist for all 20 shards.

On Tue, Jan 10, 2017 at 11:34 AM, Chetas Joshi <chetas.jo...@gmail.com>
wrote:

> Hello,
>
> The following is my config
>
> Solr 5.5.0 on HDFS (SolrCloud of 25 nodes)
> collection with shards=20, maxShards per node=1, replicationFactor=1,
> autoAddReplicas=true
>
> The ingestion process had been working fine for the last 3 months.
>
> Yesterday, the ingestion process started throwing the following exceptions:
> SolrException: No active slice servicing hash code 7270a60c in
> DocCollection()
>
> I can see that suddenly 2 shards missing. Solr Cloud UI says number of
> shards for the collection are 18. Somehow, shards have got deleted. The
> data is available on hdfs.
>
> Is there a way I can restart those shards on 2 of the hosts and provide a
> particular hash range(The hash ranges that are missing) ?
>
> Thanks!
>
>


Missing shards/hash range

2017-01-10 Thread Chetas Joshi
Hello,

The following is my config

Solr 5.5.0 on HDFS (SolrCloud of 25 nodes)
collection with shards=20, maxShards per node=1, replicationFactor=1,
autoAddReplicas=true

The ingestion process had been working fine for the last 3 months.

Yesterday, the ingestion process started throwing the following exceptions:
SolrException: No active slice servicing hash code 7270a60c in
DocCollection()

I can see that 2 shards are suddenly missing. The Solr Cloud UI says the number of
shards for the collection is 18. Somehow, those shards have been deleted. The
data is still available on HDFS.

Is there a way I can restart those shards on 2 of the hosts and provide a
particular hash range (the hash ranges that are missing)?

Thanks!


Re: Solr Initialization failure

2017-01-04 Thread Chetas Joshi
Hi Shawn

Thanks for the explanation!

I had the slab count set to 20 and did not have the global block cache enabled.
I have a follow-up question: does setting slab count=1 affect the
write/read performance of Solr while reading the indices from HDFS, or is this
setting only used while creating new cores?

Thanks!

On Wed, Jan 4, 2017 at 4:11 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 1/4/2017 1:43 PM, Chetas Joshi wrote:
> > while creating a new collection, it fails to spin up solr cores on some
> > nodes due to "insufficient direct memory".
> >
> > Here is the error:
> >
> >- *3044_01_17_shard42_replica1:*
> > org.apache.solr.common.SolrException:org.apache.solr.
> common.SolrException:
> >The max direct memory is likely too low. Either increase it (by adding
> >-XX:MaxDirectMemorySize=g -XX:+UseLargePages to your containers
> >startup args) or disable direct allocation using
> >solr.hdfs.blockcache.direct.memory.allocation=false in
> solrconfig.xml. If
> >you are putting the block cache on the heap, your java heap size
> might not
> >be large enough. Failed allocating ~2684.35456 MB.
> >
> > The error is self explanatory.
> >
> > My question is: why does it require around 2.7 GB of off-heap memory to
> > spin up a single core??
>
> This message comes from the HdfsDirectoryFactory class.  This is the
> calculation of the total amount of memory needed:
>
> long totalMemory = (long) bankCount * (long) numberOfBlocksPerBank
> * (long) blockSize;
>
> The numberOfBlocksPerBank variable can come from the configuration, the
> code defaults it to 16384.  The blockSize variable gets assigned by a
> convoluted method involving bit shifts, and defaults to 8192.   The
> bankCount variable seems to come from solr.hdfs.blockcache.slab.count,
> and apparently defaults to 1.  Looks like it's been set to 20 on your
> config.  If we assume the other two are at their defaults and you have
> 20 for the slab count, then this results in 2684354560 bytes, which
> would cause the exact output seen in the error message when the memory
> allocation fails.
>
> I know very little about HDFS or how the HDFS directory works, but
> apparently it needs a lot of memory if you want good performance.
> Reducing solr.hdfs.blockcache.slab.count sounds like it might result in
> less memory being required.
>
> You might want to review this page for info about how to set up HDFS,
> where it says that each slab requires 128MB of memory:
>
> https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS
>
> The default settings for the HDFS directory cause the block cache to be
> global, so all cores use it, instead of spinning up another cache for
> every additional core.
>
> What I've seen sounds like one of these two problems:  1) You've turned
> off the global cache option.  2) This node doesn't yet have any HDFS
> > cores, so your collection create is trying to create the first core using
> HDFS.  That action is trying to allocate the global cache, which has
> been sized at 20 slabs.
>
> Thanks,
> Shawn
>
>
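
As a quick check on the arithmetic in Shawn's reply, here is a small Java sketch of the same calculation, using the defaults he quotes (16384 blocks per bank, 8192-byte blocks). It reproduces the ~2684.35 MB figure from the failed allocation message for slab.count=20 and the roughly 128 MiB-per-slab figure from the reference guide for slab.count=1; the class and variable names mirror the snippet quoted above, not a full implementation.

public class BlockCacheSizeSketch {
  public static void main(String[] args) {
    // Defaults quoted in the thread for HdfsDirectoryFactory's block cache.
    long numberOfBlocksPerBank = 16384;
    long blockSize = 8192;

    for (long bankCount : new long[] {1, 20}) {
      long totalMemory = bankCount * numberOfBlocksPerBank * blockSize;
      System.out.printf("slab.count=%d -> %d bytes (~%.2f MB)%n",
          bankCount, totalMemory, totalMemory / 1_000_000.0);
    }
    // slab.count=1  -> 134217728 bytes  (128 MiB, the per-slab size from the reference guide)
    // slab.count=20 -> 2684354560 bytes (~2684.35 MB, matching the failed allocation message)
  }
}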


Solr Initialization failure

2017-01-04 Thread Chetas Joshi
Hello,

While creating a new collection, it fails to spin up Solr cores on some
nodes due to "insufficient direct memory".

Here is the error:

   - *3044_01_17_shard42_replica1:*
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
   The max direct memory is likely too low. Either increase it (by adding
   -XX:MaxDirectMemorySize=<size>g -XX:+UseLargePages to your containers
   startup args) or disable direct allocation using
   solr.hdfs.blockcache.direct.memory.allocation=false in solrconfig.xml. If
   you are putting the block cache on the heap, your java heap size might not
   be large enough. Failed allocating ~2684.35456 MB.

The error is self-explanatory.

My question is: why does it require around 2.7 GB of off-heap memory to
spin up a single core??

Thank you!


Re: Solr on HDFS: Streaming API performance tuning

2016-12-19 Thread Chetas Joshi
Hi Joel,

I don't have any solr documents that have NULL values for the sort fields I
use in my queries.

Thanks!

On Sun, Dec 18, 2016 at 12:56 PM, Joel Bernstein <joels...@gmail.com> wrote:

> Ok, based on the stack trace I suspect one of your sort fields has NULL
> values, which in the 5x branch could produce null pointers if a segment had
> no values for a sort field. This is also fixed in the Solr 6x branch.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Sat, Dec 17, 2016 at 2:44 PM, Chetas Joshi <chetas.jo...@gmail.com>
> wrote:
>
> > Here is the stack trace.
> >
> > java.lang.NullPointerException
> >
> > at
> > org.apache.solr.client.solrj.io.comp.FieldComparator$2.
> > compare(FieldComparator.java:85)
> >
> > at
> > org.apache.solr.client.solrj.io.comp.FieldComparator.
> > compare(FieldComparator.java:92)
> >
> > at
> > org.apache.solr.client.solrj.io.comp.FieldComparator.
> > compare(FieldComparator.java:30)
> >
> > at
> > org.apache.solr.client.solrj.io.comp.MultiComp.compare(
> MultiComp.java:45)
> >
> > at
> > org.apache.solr.client.solrj.io.comp.MultiComp.compare(
> MultiComp.java:33)
> >
> > at
> > org.apache.solr.client.solrj.io.stream.CloudSolrStream$
> > TupleWrapper.compareTo(CloudSolrStream.java:396)
> >
> > at
> > org.apache.solr.client.solrj.io.stream.CloudSolrStream$
> > TupleWrapper.compareTo(CloudSolrStream.java:381)
> >
> > at java.util.TreeMap.put(TreeMap.java:560)
> >
> > at java.util.TreeSet.add(TreeSet.java:255)
> >
> > at
> > org.apache.solr.client.solrj.io.stream.CloudSolrStream._
> > read(CloudSolrStream.java:366)
> >
> > at
> > org.apache.solr.client.solrj.io.stream.CloudSolrStream.
> > read(CloudSolrStream.java:353)
> >
> > at
> >
> > *.*.*.*.SolrStreamResultIterator$$anon$1.run(SolrStreamResultIterator.
> > scala:101)
> >
> > at java.lang.Thread.run(Thread.java:745)
> >
> > 16/11/17 13:04:31 *ERROR* SolrStreamResultIterator:missing exponent
> > number:
> > char=A,position=106596
> > BEFORE='p":1477189323},{"uuid":"//699/UzOPQx6thu","timestamp": 6EA'
> > AFTER='E 1476861439},{"uuid":"//699/vG8k4Tj'
> >
> > org.noggit.JSONParser$ParseException: missing exponent number:
> > char=A,position=106596
> > BEFORE='p":1477189323},{"uuid":"//699/UzOPQx6thu","timestamp": 6EA'
> > AFTER='E 1476861439},{"uuid":"//699/vG8k4Tj'
> >
> > at org.noggit.JSONParser.err(JSONParser.java:356)
> >
> > at org.noggit.JSONParser.readExp(JSONParser.java:513)
> >
> > at org.noggit.JSONParser.readNumber(JSONParser.java:419)
> >
> > at org.noggit.JSONParser.next(JSONParser.java:845)
> >
> > at org.noggit.JSONParser.nextEvent(JSONParser.java:951)
> >
> > at org.noggit.ObjectBuilder.getObject(ObjectBuilder.java:127)
> >
> > at org.noggit.ObjectBuilder.getVal(ObjectBuilder.java:57)
> >
> > at org.noggit.ObjectBuilder.getVal(ObjectBuilder.java:37)
> >
> > at
> > org.apache.solr.client.solrj.io.stream.JSONTupleStream.
> > next(JSONTupleStream.java:84)
> >
> > at
> > org.apache.solr.client.solrj.io.stream.SolrStream.read(
> > SolrStream.java:147)
> >
> > at
> > org.apache.solr.client.solrj.io.stream.CloudSolrStream$
> TupleWrapper.next(
> > CloudSolrStream.java:413)
> >
> > at
> > org.apache.solr.client.solrj.io.stream.CloudSolrStream._
> > read(CloudSolrStream.java:365)
> >
> > at
> > org.apache.solr.client.solrj.io.stream.CloudSolrStream.
> > read(CloudSolrStream.java:353)
> >
> >
> > Thanks!
> >
> > On Fri, Dec 16, 2016 at 11:45 PM, Reth RM <reth.ik...@gmail.com> wrote:
> >
> > > If you could provide the json parse exception stack trace, it might
> help
> > to
> > > predict issue there.
> > >
> > >
> > > On Fri, Dec 16, 2016 at 5:52 PM, Chetas Joshi <chetas.jo...@gmail.com>
> > > wrote:
> > >
> > > > Hi Joel,
> > > >
> > > > The only NON alpha-numeric characters I have in my data are '+' and
> > '/'.
> > > I
> > > > don't have any backslashes.
> > > >
> > > > If the sp

Re: Solr on HDFS: Streaming API performance tuning

2016-12-17 Thread Chetas Joshi
Here is the stack trace.

java.lang.NullPointerException

at
org.apache.solr.client.solrj.io.comp.FieldComparator$2.compare(FieldComparator.java:85)

at
org.apache.solr.client.solrj.io.comp.FieldComparator.compare(FieldComparator.java:92)

at
org.apache.solr.client.solrj.io.comp.FieldComparator.compare(FieldComparator.java:30)

at
org.apache.solr.client.solrj.io.comp.MultiComp.compare(MultiComp.java:45)

at
org.apache.solr.client.solrj.io.comp.MultiComp.compare(MultiComp.java:33)

at
org.apache.solr.client.solrj.io.stream.CloudSolrStream$TupleWrapper.compareTo(CloudSolrStream.java:396)

at
org.apache.solr.client.solrj.io.stream.CloudSolrStream$TupleWrapper.compareTo(CloudSolrStream.java:381)

at java.util.TreeMap.put(TreeMap.java:560)

at java.util.TreeSet.add(TreeSet.java:255)

at
org.apache.solr.client.solrj.io.stream.CloudSolrStream._read(CloudSolrStream.java:366)

at
org.apache.solr.client.solrj.io.stream.CloudSolrStream.read(CloudSolrStream.java:353)

at

*.*.*.*.SolrStreamResultIterator$$anon$1.run(SolrStreamResultIterator.scala:101)

at java.lang.Thread.run(Thread.java:745)

16/11/17 13:04:31 *ERROR* SolrStreamResultIterator:missing exponent number:
char=A,position=106596
BEFORE='p":1477189323},{"uuid":"//699/UzOPQx6thu","timestamp": 6EA'
AFTER='E 1476861439},{"uuid":"//699/vG8k4Tj'

org.noggit.JSONParser$ParseException: missing exponent number:
char=A,position=106596
BEFORE='p":1477189323},{"uuid":"//699/UzOPQx6thu","timestamp": 6EA'
AFTER='E 1476861439},{"uuid":"//699/vG8k4Tj'

at org.noggit.JSONParser.err(JSONParser.java:356)

at org.noggit.JSONParser.readExp(JSONParser.java:513)

at org.noggit.JSONParser.readNumber(JSONParser.java:419)

at org.noggit.JSONParser.next(JSONParser.java:845)

at org.noggit.JSONParser.nextEvent(JSONParser.java:951)

at org.noggit.ObjectBuilder.getObject(ObjectBuilder.java:127)

at org.noggit.ObjectBuilder.getVal(ObjectBuilder.java:57)

at org.noggit.ObjectBuilder.getVal(ObjectBuilder.java:37)

at
org.apache.solr.client.solrj.io.stream.JSONTupleStream.next(JSONTupleStream.java:84)

at
org.apache.solr.client.solrj.io.stream.SolrStream.read(SolrStream.java:147)

at
org.apache.solr.client.solrj.io.stream.CloudSolrStream$TupleWrapper.next(CloudSolrStream.java:413)

at
org.apache.solr.client.solrj.io.stream.CloudSolrStream._read(CloudSolrStream.java:365)

at
org.apache.solr.client.solrj.io.stream.CloudSolrStream.read(CloudSolrStream.java:353)


Thanks!

On Fri, Dec 16, 2016 at 11:45 PM, Reth RM <reth.ik...@gmail.com> wrote:

> If you could provide the json parse exception stack trace, it might help to
> predict issue there.
>
>
> On Fri, Dec 16, 2016 at 5:52 PM, Chetas Joshi <chetas.jo...@gmail.com>
> wrote:
>
> > Hi Joel,
> >
> > The only NON alpha-numeric characters I have in my data are '+' and '/'.
> I
> > don't have any backslashes.
> >
> > If the special characters was the issue, I should get the JSON parsing
> > exceptions every time irrespective of the index size and irrespective of
> > the available memory on the machine. That is not the case here. The
> > streaming API successfully returns all the documents when the index size
> is
> > small and fits in the available memory. That's the reason I am confused.
> >
> > Thanks!
> >
> > On Fri, Dec 16, 2016 at 5:43 PM, Joel Bernstein <joels...@gmail.com>
> > wrote:
> >
> > > The Streaming API may have been throwing exceptions because the JSON
> > > special characters were not escaped. This was fixed in Solr 6.0.
> > >
> > >
> > >
> > >
> > >
> > >
> > > Joel Bernstein
> > > http://joelsolr.blogspot.com/
> > >
> > > On Fri, Dec 16, 2016 at 4:34 PM, Chetas Joshi <chetas.jo...@gmail.com>
> > > wrote:
> > >
> > > > Hello,
> > > >
> > > > I am running Solr 5.5.0.
> > > > It is a solrCloud of 50 nodes and I have the following config for all
> > the
> > > > collections.
> > > > maxShardsperNode: 1
> > > > replicationFactor: 1
> > > >
> > > > I was using Streaming API to get back results from Solr. It worked
> fine
> > > for
> > > > a while until the index data size reached beyond 40 GB per shard
> (i.e.
> > > per
> > > > node). It started throwing JSON parsing exceptions while reading the
> > > > TupleStream data. FYI: I have other services (Yarn, Spark) deployed
> on
> > &

Re: Solr on HDFS: Streaming API performance tuning

2016-12-16 Thread Chetas Joshi
Hi Joel,

The only NON alpha-numeric characters I have in my data are '+' and '/'. I
don't have any backslashes.

If special characters were the issue, I should get the JSON parsing
exceptions every time, irrespective of the index size and irrespective of
the available memory on the machine. That is not the case here. The
streaming API successfully returns all the documents when the index size is
small and fits in the available memory. That's the reason I am confused.

Thanks!

On Fri, Dec 16, 2016 at 5:43 PM, Joel Bernstein <joels...@gmail.com> wrote:

> The Streaming API may have been throwing exceptions because the JSON
> special characters were not escaped. This was fixed in Solr 6.0.
>
>
>
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Fri, Dec 16, 2016 at 4:34 PM, Chetas Joshi <chetas.jo...@gmail.com>
> wrote:
>
> > Hello,
> >
> > I am running Solr 5.5.0.
> > It is a solrCloud of 50 nodes and I have the following config for all the
> > collections.
> > maxShardsperNode: 1
> > replicationFactor: 1
> >
> > I was using Streaming API to get back results from Solr. It worked fine
> for
> > a while until the index data size reached beyond 40 GB per shard (i.e.
> per
> > node). It started throwing JSON parsing exceptions while reading the
> > TupleStream data. FYI: I have other services (Yarn, Spark) deployed on
> the
> > same boxes on which Solr shards are running. Spark jobs also use a lot of
> > disk cache. So, the free available disk cache on the boxes vary a
> > lot depending upon what else is running on the box.
> >
> > Due to this issue, I moved to using the cursor approach and it works fine
> > but as we all know it is way slower than the streaming approach.
> >
> > Currently the index size per shard is 80GB (The machine has 512 GB of RAM
> > and being used by different services/programs: heap/off-heap and the disk
> > cache requirements).
> >
> > When I have enough RAM (more than 80 GB so that all the index data could
> > fit in memory) available on the machine, the streaming API succeeds
> without
> > running into any exceptions.
> >
> > Question:
> > How different the index data caching mechanism (for HDFS) is for the
> > Streaming API from the cursorMark approach?
> > Why cursor works every time but streaming works only when there is a lot
> of
> > free disk cache?
> >
> > Thank you.
> >
>


Solr on HDFS: Streaming API performance tuning

2016-12-16 Thread Chetas Joshi
Hello,

I am running Solr 5.5.0.
It is a solrCloud of 50 nodes and I have the following config for all the
collections.
maxShardsperNode: 1
replicationFactor: 1

I was using the Streaming API to get results back from Solr. It worked fine for
a while until the index data size grew beyond 40 GB per shard (i.e. per
node). It then started throwing JSON parsing exceptions while reading the
TupleStream data. FYI: I have other services (Yarn, Spark) deployed on the
same boxes on which the Solr shards are running. Spark jobs also use a lot of
disk cache, so the free available disk cache on the boxes varies a
lot depending upon what else is running on the box.

Due to this issue, I moved to using the cursor approach and it works fine
but as we all know it is way slower than the streaming approach.

Currently the index size per shard is 80GB (The machine has 512 GB of RAM
and being used by different services/programs: heap/off-heap and the disk
cache requirements).

When I have enough RAM (more than 80 GB so that all the index data could
fit in memory) available on the machine, the streaming API succeeds without
running into any exceptions.

Question:
How different is the index data caching mechanism (for HDFS) for the
Streaming API compared to the cursorMark approach?
Why does the cursor work every time while streaming works only when there is a lot of
free disk cache?

Thank you.


Re: Solr on HDFS: increase in query time with increase in data

2016-12-16 Thread Chetas Joshi
Thank you everyone. I would add nodes to the SolrCloud and split the shards.

Shawn,

Thank you for explaining why putting the index data on the local file system could
be a better idea than using HDFS. I need to find out how HDFS caches the
index files in a resource-constrained environment.

I would also like to add that when I try the Streaming API instead of using
the cursor approach, it starts running into JSON parsing exceptions when my
nodes (running Solr shards) don't have enough RAM to fit the entire index
into memory. FYI: I have other services (Yarn, Spark) deployed on the same
boxes as well. Spark jobs also use a lot of disk cache.
When I have enough RAM (more than 70 GB so that all the index data could
fit in memory), the streaming API succeeds without running into any
exceptions. How different is the index data caching mechanism for the
Streaming API compared to the cursor approach?

Thanks!



On Fri, Dec 16, 2016 at 6:52 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 12/14/2016 11:58 AM, Chetas Joshi wrote:
> > I am running Solr 5.5.0 on HDFS. It is a solrCloud of 50 nodes and I have
> > the following config.
> > maxShardsperNode: 1
> > replicationFactor: 1
> >
> > I have been ingesting data into Solr for the last 3 months. With increase
> > in data, I am observing increase in the query time. Currently the size of
> > my indices is 70 GB per shard (i.e. per node).
>
> Query times will increase as the index size increases, but significant
> jumps in the query time may be an indication of a performance problem.
> Performance problems are usually caused by insufficient resources,
> memory in particular.
>
> With HDFS, I am honestly not sure *where* the cache memory is needed.  I
> would assume that it's needed on the HDFS hosts, that a lot of spare
> memory on the Solr (HDFS client) hosts probably won't make much
> difference.  I could be wrong -- I have no idea what kind of caching
> HDFS does.  If the HDFS client can cache data, then you probably would
> want extra memory on the Solr machines.
>
> > I am using cursor approach (/export handler) using SolrJ client to get
> back
> > results from Solr. All the fields I am querying on and all the fields
> that
> > I get back from Solr are indexed and have docValues enabled as well. What
> > could be the reason behind increase in query time?
>
> If actual disk access is required to satisfy a query, Solr is going to
> be slow.  Caching is absolutely required for good performance.  If your
> query times are really long but used to be short, chances are that your
> index size has exceeded your system's ability to cache it effectively.
>
> One thing to keep in mind:  Gigabit Ethernet is comparable in speed to
> the sustained transfer rate of a single modern SATA magnetic disk, so if
> the data has to traverse a gigabit network, it probably will be nearly
> as slow as it would be if it were coming from a single disk.  Having a
> 10gig network for your storage is probably a good idea ... but current
> fast memory chips can leave 10gig in the dust, so if the data can come
> from cache and the chips are new enough, then it can be faster than
> network storage.
>
> Because the network can be a potential bottleneck, I strongly recommend
> putting index data on local disks.  If you have enough memory, the disk
> doesn't even need to be super-fast.
>
> > Has this got something to do with the OS disk cache that is used for
> > loading the Solr indices? When a query is fired, will Solr wait for all
> > (70GB) of disk cache being available so that it can load the index file?
>
> Caching the files on the disk is not handled by Solr, so Solr won't wait
> for the entire index to be cached unless the underlying storage waits
> for some reason.  The caching is usually handled by the OS.  For HDFS,
> it might be handled by a combination of the OS and Hadoop, but I don't
> know enough about HDFS to comment.  Solr makes a request for the parts
> of the index files that it needs to satisfy the request.  If the
> underlying system is capable of caching the data, if that feature is
> enabled, and if there's memory available for that purpose, then it gets
> cached.
>
> Thanks,
> Shawn
>
>


Solr on HDFS: increase in query time with increase in data

2016-12-14 Thread Chetas Joshi
Hi everyone,

I am running Solr 5.5.0 on HDFS. It is a solrCloud of 50 nodes and I have
the following config.
maxShardsperNode: 1
replicationFactor: 1

I have been ingesting data into Solr for the last 3 months. With increase
in data, I am observing increase in the query time. Currently the size of
my indices is 70 GB per shard (i.e. per node).

I am using cursor approach (/export handler) using SolrJ client to get back
results from Solr. All the fields I am querying on and all the fields that
I get back from Solr are indexed and have docValues enabled as well. What
could be the reason behind increase in query time?

Has this got something to do with the OS disk cache that is used for
loading the Solr indices? When a query is fired, will Solr wait for all
(70GB) of disk cache being available so that it can load the index file?

Thanks!


Re: CloudSolrClient$RouteException: Cannot talk to ZooKeeper - Updates are disabled.

2016-11-21 Thread Chetas Joshi
Thanks Erick and Shawn.

I have reduced the number of rows per page from 500K to 100K.
I also increased the zkClientTimeout to 30 seconds so that I don't run into
ZK timeout issues. The ZK cluster has been deployed on hosts other
than the SolrCloud hosts.

However, I was trying to increase the number of rows per page due to the
following reason: Running ingestion at the same time as running queries has
increased the amount of time it takes to read results from Solr using the
Cursor approach by 5 times. I am able to read 1M sorted documents in 1 hour
(88 bytes of data per document).

What could be the reason behind the low speed of query execution? I am
running solr servers with heap=16g and off-heap=16g. Off-heap is being used
as the block cache. Do ingestion and query execution both use a lot of
block cache? Should I increase the block cache size in order to improve the
query performance? Should I increase slab.count or maxDirectMemorySize?
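
For context, the block cache is configured through the HdfsDirectoryFactory
settings in solrconfig.xml, roughly like the sketch below (the values are
illustrative, not my exact settings). Each slab is 128 MB, so slab.count x
128 MB has to stay below -XX:MaxDirectMemorySize:

<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://nameservice/solr</str>
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <bool name="solr.hdfs.blockcache.global">true</bool>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
  <int name="solr.hdfs.blockcache.slab.count">100</int>
  <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
  <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
</directoryFactory>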

Thanks!

On Sat, Nov 19, 2016 at 8:13 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Returning 500K rows is, as Shawn says, not Solr's sweet spot.
>
> My guess: All the work you're doing trying to return that many
> rows, particularly in SolrCloud mode is simply overloading
> your system to the point that the ZK connection times out. Don't
> do that. If you need that many rows, either Shawn's cursorMark
> option or use export/streaming aggregation are much better
> choices.
>
> Consider what happens on a sharded request:
> - the initial node sends a sub-request to a replica for each shard.
> - each replica returns it's candidate topN (doc ID and sort criteria)
> - the initial node sorts these lists (1M from each replica in your
> example) to get the true top N
> - the initial node requests the docs from each replica that made it
> into the true top N
> - each replica goes to disk, decompresses the doc and pulls out the fields
> - each replica sends its portion of the top N to the initial node
> - an enormous packet containing all 1M final docs is assembled and
> returned to the client.
> - this sucks up bandwidth and resources
> - that's bad enough, but especially if your ZK nodes are on the same
> box as your Solr nodes they're even more like to have a timeout issue.
>
>
> Best,
> Erick
>
> On Fri, Nov 18, 2016 at 8:45 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> > On 11/18/2016 6:50 PM, Chetas Joshi wrote:
> >> The numFound is millions but I was also trying with rows= 1 Million. I
> will reduce it to 500K.
> >>
> >> I am sorry. It is state.json. I am using Solr 5.5.0
> >>
> >> One of the things I am not able to understand is why my ingestion job is
> >> complaining about "Cannot talk to ZooKeeper - Updates are disabled."
> >>
> >> I have a spark streaming job that continuously ingests into Solr. My
> shards are always up and running. The moment I start a query on SolrCloud
> it starts running into this exception. However as you said ZK will only
> update the state of the cluster when the shards go down. Then why my job is
> trying to contact ZK when the cluster is up and why is the exception about
> updating ZK?
> >
> > SolrCloud and SolrJ (CloudSolrClient) both maintain constant connections
> > to all the zookeeper servers they are configured to use.  If zookeeper
> > quorum is lost, SolrCloud will go read-only -- no updating is possible.
> > That is what is meant by "updates are disabled."
> >
> > Solr and Lucene are optimized for very low rowcounts, typically two or
> > three digits.  Asking for hundreds of thousands of rows is problematic.
> > The cursorMark feature is designed for efficient queries when paging
> > deeply into results, but it assumes your rows value is relatively small,
> > and that you will be making many queries to get a large number of
> > results, each of which will be fast and won't overload the server.
> >
> > Since it appears you are having a performance issue, here's a few things
> > I have written on the topic:
> >
> > https://wiki.apache.org/solr/SolrPerformanceProblems
> >
> > Thanks,
> > Shawn
> >
>


Re: CloudSolrClient$RouteException: Cannot talk to ZooKeeper - Updates are disabled.

2016-11-18 Thread Chetas Joshi
Thanks Erick.

The numFound is millions but I was also trying with rows= 1 Million. I will
reduce it to 500K.

I am sorry. It is state.json. I am using Solr 5.5.0

One of the things I am not able to understand is why my ingestion job is
complaining about "Cannot talk to ZooKeeper - Updates are disabled."

I have a spark streaming job that continuously ingests into Solr. My shards
are always up and running. The moment I start a query on SolrCloud it
starts running into this exception. However as you said ZK will only update
the state of the cluster when the shards go down. Then why my job is trying
to contact ZK when the cluster is up and why is the exception about
updating ZK?


On Fri, Nov 18, 2016 at 5:11 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> The clusterstate on Zookeeper shouldn't be changing
> very often, only when nodes come and go.
>
> bq: At that time I am also running queries (that return
> millions of docs).
>
> As in rows=milions? This is an anti-pattern, if that's true
> then you're probably network saturated and the like. If
> you mean your numFound is millions, then this is unlikely
> to be a problem.
>
> you say "clusterstate.json", which indicates you're on
> 4x? This has been changed to make a state.json for
> each collection, so either you upgraded sometime and
> didn't transform you ZK (there's a command to do that)
> or can you upgrade?
>
> What I'm guessing is that you have too much going on
> somehow and you're overloading your system and
> getting a timeout. So increasing the timeout
> is definitely a possibility, or reducing the ingestion load
> as a test.
>
> Best,
> Erick
>
> On Fri, Nov 18, 2016 at 4:51 PM, Chetas Joshi <chetas.jo...@gmail.com>
> wrote:
> > Hi,
> >
> > I have a SolrCloud (on HDFS) of 50 nodes and a ZK quorum of 5 nodes. The
> > SolrCloud is having difficulties talking to ZK when I am ingesting data
> > into the collections. At that time I am also running queries (that return
> > millions of docs). The ingest job is crying with the following
> exception
> >
> > org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
> > from server at http://xxx/solr/collection1_shard15_replica1: Cannot
> talk to
> > ZooKeeper - Updates are disabled.
> >
> > I think this is happening when the ingest job is trying to update the
> > clusterstate.json file but the query is reading from that file and thus
> has
> > some kind of a lock on that file. Are there any factors that will cause
> the
> > "READ" to acquire lock for a long time? Is my understanding correct? I am
> > using the cursor approach using SolrJ to get back results from Solr.
> >
> > How often is the ZK updated with the latest cluster state and what
> > parameter governs that? Should I just increase the ZK client timeout so
> > that it retries connecting to the ZK for a longer period of time (right
> now
> > it is 15 seconds)?
> >
> > Thanks!
>


CloudSolrClient$RouteException: Cannot talk to ZooKeeper - Updates are disabled.

2016-11-18 Thread Chetas Joshi
Hi,

I have a SolrCloud (on HDFS) of 50 nodes and a ZK quorum of 5 nodes. The
SolrCloud is having difficulties talking to ZK when I am ingesting data
into the collections. At that time I am also running queries (that return
millions of docs). The ingest job is crying with the following exception

org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
from server at http://xxx/solr/collection1_shard15_replica1: Cannot talk to
ZooKeeper - Updates are disabled.

I think this is happening when the ingest job is trying to update the
clusterstate.json file but the query is reading from that file and thus has
some kind of a lock on that file. Are there any factors that will cause the
"READ" to acquire lock for a long time? Is my understanding correct? I am
using the cursor approach using SolrJ to get back results from Solr.

How often is the ZK updated with the latest cluster state and what
parameter governs that? Should I just increase the ZK client timeout so
that it retries connecting to the ZK for a longer period of time (right now
it is 15 seconds)?

Thanks!


Re: index dir of core xxx is already locked.

2016-11-16 Thread Chetas Joshi
I don't kill the solr instance forcefully using "kill -9".

I checked the core.properties file for that shard. The content is different
from the core.properties file for all the other shards.
It has the following two lines which are different

config=solrconfig.xml

schema=schema.xml

In other shards, it is

collection.configName=v4 (name I have given to the config)

name=collectionName_shardNumber_replica1

Should I modify this file before restarting the Cloud?

There is a strange thing I just observed about the data dir of the shard
that is not coming up. There is an additional index dir that has been created

hdfs://Ingest/solr53/collection/core_node32/data/index/index/

The size and content is same as of

hdfs://Ingest/solr53/collection/core_node32/data/index/


What could be the reason of this extra dir? Should I delete it?


Thanks!


On Wed, Nov 16, 2016 at 1:51 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> bq: Before restarting, I delete all the write.lock files from the data
> dir. But
> every time I restart I get the same exception.
>
> First, this shouldn't be necessary. Are you by any chance killing the
> Solr instances with
> the equivalent of "kill -9"? Allow them to shut down gracefully. That
> said, until recently
> the bin/solr script would kill them forcefully after 5 seconds which
> is too short an interval.
>
> But the error really is telling you that somehow two or more Solr
> cores are pointing at the
> same data directory. Whichever one gets there first will block any
> later cores with the
> message you see. So check your core.properties files and your HDFS magic
> to see
> how this is occurring would be my first guess.
>
> Best,
> Erick
>
> On Wed, Nov 16, 2016 at 1:38 PM, Chetas Joshi <chetas.jo...@gmail.com>
> wrote:
> > Hi,
> >
> > I have a SolrCloud (on HDFS) of 52 nodes. I have 3 collections each with
> 50
> > shards and maxShards per node for every collection is 1.
> >
> > I am having problem restarting a solr shard for a collection.
> >
> > When I restart, there is always a particular shard of a particular
> > collection that remains down. The 2 shards on the same host for the rest
> of
> > the collections are up and running.
> >
> > Before restarting, I delete all the write.lock files from the data dir.
> But
> > every time I restart I get the same exception.
> >
> > index dir yyy of core xxx is already locked. The most likely cause is
> > another Solr server (or another solr core in this server) also configured
> > to use this directory; other possible causes may be specific to lockType:
> > hdfs
> >
> > Thanks!
>


index dir of core xxx is already locked.

2016-11-16 Thread Chetas Joshi
Hi,

I have a SolrCloud (on HDFS) of 52 nodes. I have 3 collections each with 50
shards and maxShards per node for every collection is 1.

I am having a problem restarting a solr shard for a collection.

When I restart, there is always a particular shard of a particular
collection that remains down. The 2 shards on the same host for the rest of
the collections are up and running.

Before restarting, I delete all the write.lock files from the data dir. But
every time I restart I get the same exception.

index dir yyy of core xxx is already locked. The most likely cause is
another Solr server (or another solr core in this server) also configured
to use this directory; other possible causes may be specific to lockType:
hdfs

Thanks!


Re: Solr shards: very sensitive to swap space usage !?

2016-11-14 Thread Chetas Joshi
Thanks everyone!
The discussion is really helpful.

Hi Toke, can you explain exactly what you mean by "the aggressive IO for
the memory mapping caused the kernel to start swapping parts of the JVM
heap to get better caching of storage data"?
Which JVM are you talking about? Solr shard? I have other services running
on the same host as well.

Thanks!

On Fri, Nov 11, 2016 at 7:32 AM, Shawn Heisey  wrote:

> On 11/11/2016 6:46 AM, Toke Eskildsen wrote:
> > but on two occasions I have
> > experienced heavy swapping with multiple gigabytes free for disk
> > cache. In both cases, the cache-to-index size was fairly low (let's
> > say < 10%). My guess (I don't know the intrinsics of memory mapping
> > vs. swapping) is that the aggressive IO for the memory mapping caused
> > the kernel to start swapping parts of the JVM heap to get better
> > caching of storage data. Yes, with terrible performance as a result.
>
> That's really weird, and sounds like a broken operating system.  I've
> had other issues with swap, but in those cases, free memory was actually
> near zero, and it sounds like your situation was not the same.  So the
> OP here might be having similar problems even if nothing's
> misconfigured.  If so, your solution will probably help them.
>
> > No matter the cause, the swapping problems were "solved" by
> > effectively disabling the swap (swappiness 0).
>
> Solr certainly doesn't need (or even want) swap, if the machine is sized
> right.  I've read some things saying that Linux doesn't behave correctly
> if you completely get rid of all swap, but setting swappiness to zero
> sounds like a good option.  The OS would still utilize swap if it
> actually ran out of physical memory, so you don't lose the safety valve
> that swap normally provides.
>
> Thanks,
> Shawn
>
>


Re: Parallelize Cursor approach

2016-11-14 Thread Chetas Joshi
I got it when you said to form N queries. I just wanted to try the "get all
cursorMarks first" approach, but I realized it would be very inefficient:
as you said, a cursorMark is the serialized version of the last sort value
received, so Solr still reads through the results even though "fl" -> null.

I wanted to try that approach because I need everything sorted. With N
queries, I will have to merge sort the results of the N queries, but that
should still be much better than the first approach I tried.
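
Roughly what I have in mind is the following (an untested sketch; the ZK
string, collection name, partition_field and the 10-way split are all
placeholders). Each worker runs its own cursorMark loop over a disjoint fq
range, and the per-partition outputs get merge sorted at the end:

import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.CloudSolrClient
import org.apache.solr.client.solrj.response.QueryResponse
import org.apache.solr.common.params.CursorMarkParams

val client = new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181/solr")
client.setDefaultCollection("collection1")

// Drain one disjoint partition with its own cursorMark loop.
def drainPartition(fq: String): Unit = {
  val q = new SolrQuery("*:*")
  q.addFilterQuery(fq)                        // the disjoint range for this worker
  q.setRows(100000)
  q.setSort(SolrQuery.SortClause.asc("id"))
  var cursorMark = CursorMarkParams.CURSOR_MARK_START
  var done = false
  while (!done) {
    q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark)
    val rsp: QueryResponse = client.query(q)
    // write rsp.getResults() somewhere, to be merge sorted later ...
    val next = rsp.getNextCursorMark
    done = cursorMark == next                 // same mark twice means this partition is done
    cursorMark = next
  }
}

implicit val ec = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(10))
val ranges = (0 until 10).map { i =>
  if (i == 9) "partition_field:[90 TO 100]"               // last bucket inclusive
  else s"partition_field:[${i * 10} TO ${(i + 1) * 10}}"  // exclusive upper bound
}
Await.result(Future.sequence(ranges.map(fq => Future(drainPartition(fq)))), Duration.Inf)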

Thanks!

On Mon, Nov 14, 2016 at 3:58 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> You're executing all the queries to parallelize before even starting.
> Seems very inefficient. My suggestion doesn't require this first step.
> Perhaps it was confusing because I mentioned "your own cursorMark".
> Really I meant bypass that entirely, just form N queries that were
> restricted to N disjoint subsets of the data and process them all in
> parallel, either with /export or /select.
>
> Best,
> Erick
>
> On Mon, Nov 14, 2016 at 3:53 PM, Chetas Joshi <chetas.jo...@gmail.com>
> wrote:
> > Thanks Joel for the explanation.
> >
> > Hi Erick,
> >
> > One of the ways I am trying to parallelize the cursor approach is by
> > iterating the result set twice.
> > (1) Once just to get all the cursor marks
> >
> > val q: SolrQuery = new solrj.SolrQuery()
> > q.set("q", query)
> > q.add("fq", query)
> > q.add("rows", batchSize.toString)
> > q.add("collection", collection)
> > q.add("fl", "null")
> > q.add("sort", "id asc")
> >
> > Here I am not asking for any field values ( "fl" -> null )
> >
> > (2) Once I get all the cursor marks, I can start parallel threads to get
> > the results in parallel.
> >
> > However, the first step in fact takes a lot of time. Even more than when
> I
> > would actually iterate through the results with "fl" -> field1, field2,
> > field3
> >
> > Why is this happening?
> >
> > Thanks!
> >
> >
> > On Thu, Nov 10, 2016 at 8:22 PM, Joel Bernstein <joels...@gmail.com>
> wrote:
> >
> >> Solr 5 was very early days for Streaming Expressions. Streaming
> Expressions
> >> and SQL use Java 8 so development switched to the 6.0 branch five months
> >> before the 6.0 release. So there was a very large jump in features and
> bug
> >> fixes from Solr 5 to Solr 6 in Streaming Expressions.
> >>
> >> Joel Bernstein
> >> http://joelsolr.blogspot.com/
> >>
> >> On Thu, Nov 10, 2016 at 11:14 PM, Joel Bernstein <joels...@gmail.com>
> >> wrote:
> >>
> >> > In Solr 5 the /export handler wasn't escaping json text fields, which
> >> > would produce json parse exceptions. This was fixed in Solr 6.0.
> >> >
> >> > Joel Bernstein
> >> > http://joelsolr.blogspot.com/
> >> >
> >> > On Tue, Nov 8, 2016 at 6:17 PM, Erick Erickson <
> erickerick...@gmail.com>
> >> > wrote:
> >> >
> >> >> Hmm, that should work fine. Let us know what the logs show if
> anything
> >> >> because this is weird.
> >> >>
> >> >> Best,
> >> >> Erick
> >> >>
> >> >> On Tue, Nov 8, 2016 at 1:00 PM, Chetas Joshi <chetas.jo...@gmail.com
> >
> >> >> wrote:
> >> >> > Hi Erick,
> >> >> >
> >> >> > This is how I use the streaming approach.
> >> >> >
> >> >> > Here is the solrconfig block.
> >> >> >
> >> >> > <requestHandler name="/export" class="solr.SearchHandler">
> >> >> >   <lst name="invariants">
> >> >> >     <str name="rq">{!xport}</str>
> >> >> >     <str name="wt">xsort</str>
> >> >> >     <str name="distrib">false</str>
> >> >> >   </lst>
> >> >> >   <arr name="components">
> >> >> >     <str>query</str>
> >> >> >   </arr>
> >> >> > </requestHandler>
> >> >> >
> >> >> > And here is the code in which SolrJ is being used.
> >> >> >
> >> >> > String zkHost = args[0];
> >> >> > String collection = args[1];
> >> >> >
> >> >> > Map props = new HashMap();
> >> >> > props.put("q", "*:*");
> >> >> > props.put("qt", "/export");
> >> >> > props.put("sort", "fieldA asc");

Re: Parallelize Cursor approach

2016-11-14 Thread Chetas Joshi
Thanks Joel for the explanation.

Hi Erick,

One of the ways I am trying to parallelize the cursor approach is by
iterating the result set twice.
(1) Once just to get all the cursor marks

val q: SolrQuery = new solrj.SolrQuery()
q.set("q", query)
q.add("fq", query)
q.add("rows", batchSize.toString)
q.add("collection", collection)
q.add("fl", "null")
q.add("sort", "id asc")

Here I am not asking for any field values ( "fl" -> null )

(2) Once I get all the cursor marks, I can start parallel threads to get
the results in parallel.

However, the first step in fact takes a lot of time, even more than when I
actually iterate through the results with "fl" -> field1, field2,
field3.

Why is this happening?

Thanks!


On Thu, Nov 10, 2016 at 8:22 PM, Joel Bernstein <joels...@gmail.com> wrote:

> Solr 5 was very early days for Streaming Expressions. Streaming Expressions
> and SQL use Java 8 so development switched to the 6.0 branch five months
> before the 6.0 release. So there was a very large jump in features and bug
> fixes from Solr 5 to Solr 6 in Streaming Expressions.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Thu, Nov 10, 2016 at 11:14 PM, Joel Bernstein <joels...@gmail.com>
> wrote:
>
> > In Solr 5 the /export handler wasn't escaping json text fields, which
> > would produce json parse exceptions. This was fixed in Solr 6.0.
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> > On Tue, Nov 8, 2016 at 6:17 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> Hmm, that should work fine. Let us know what the logs show if anything
> >> because this is weird.
> >>
> >> Best,
> >> Erick
> >>
> >> On Tue, Nov 8, 2016 at 1:00 PM, Chetas Joshi <chetas.jo...@gmail.com>
> >> wrote:
> >> > Hi Erick,
> >> >
> >> > This is how I use the streaming approach.
> >> >
> >> > Here is the solrconfig block.
> >> >
> >> > <requestHandler name="/export" class="solr.SearchHandler">
> >> >   <lst name="invariants">
> >> >     <str name="rq">{!xport}</str>
> >> >     <str name="wt">xsort</str>
> >> >     <str name="distrib">false</str>
> >> >   </lst>
> >> >   <arr name="components">
> >> >     <str>query</str>
> >> >   </arr>
> >> > </requestHandler>
> >> >
> >> > And here is the code in which SolrJ is being used.
> >> >
> >> > String zkHost = args[0];
> >> > String collection = args[1];
> >> >
> >> > Map props = new HashMap();
> >> > props.put("q", "*:*");
> >> > props.put("qt", "/export");
> >> > props.put("sort", "fieldA asc");
> >> > props.put("fl", "fieldA,fieldB,fieldC");
> >> >
> >> > CloudSolrStream cloudstream = new CloudSolrStream(zkHost,collect
> >> ion,props);
> >> >
> >> > And then I iterate through the cloud stream (TupleStream).
> >> > So I am using streaming expressions (SolrJ).
> >> >
> >> > I have not looked at the solr logs while I started getting the JSON
> >> parsing
> >> > exceptions. But I will let you know what I see the next time I run
> into
> >> the
> >> > same exceptions.
> >> >
> >> > Thanks
> >> >
> >> > On Sat, Nov 5, 2016 at 9:32 PM, Erick Erickson <
> erickerick...@gmail.com
> >> >
> >> > wrote:
> >> >
> >> >> Hmmm, export is supposed to handle 10s of million result sets. I know
> >> >> of a situation where the Streaming Aggregation functionality back
> >> >> ported to Solr 4.10 processes on that scale. So do you have any clue
> >> >> what exactly is failing? Is there anything in the Solr logs?
> >> >>
> >> >> _How_ are you using /export, through Streaming Aggregation (SolrJ) or
> >> >> just the raw xport handler? It might be worth trying to do this from
> >> >> SolrJ if you're not, it should be a very quick program to write, just
> >> >> to test we're talking 100 lines max.
> >> >>
> >> >> You could always roll your own cursor mark stuff by partitioning the
> >> >> data amongst N threads/processes if you have any reasonable
> >> >> expectation that you could form filter queries that partition the
> >> >> result set anywhere near evenly.
> >> >>
> >> >> For example, let's say you have a field with

Solr shards: very sensitive to swap space usage !?

2016-11-10 Thread Chetas Joshi
Hi,

I have a SolrCloud (Solr 5.5.0) of 50 nodes. The JVM heap memory usage of
my solr shards is never more than 50% of the total heap. However, the hosts
on which my solr shards are deployed often run into a 99% swap space usage
issue. This causes the solr shards to go down. Why are the solr shards so
sensitive to swap space usage? The JVM heap is more than enough, so the shards
should never need the swap space. What could be the reason, and where can I
find out why the solr shards go down? I don't see anything in the solr logs.

Thanks!


Re: Parallelize Cursor approach

2016-11-08 Thread Chetas Joshi
Hi Erick,

This is how I use the streaming approach.

Here is the solrconfig block.

<requestHandler name="/export" class="solr.SearchHandler">
  <lst name="invariants">
    <str name="rq">{!xport}</str>
    <str name="wt">xsort</str>
    <str name="distrib">false</str>
  </lst>
  <arr name="components">
    <str>query</str>
  </arr>
</requestHandler>

And here is the code in which SolrJ is being used.

String zkHost = args[0];
String collection = args[1];

Map<String, String> props = new HashMap<>();
props.put("q", "*:*");
props.put("qt", "/export");
props.put("sort", "fieldA asc");
props.put("fl", "fieldA,fieldB,fieldC");

CloudSolrStream cloudstream = new CloudSolrStream(zkHost,collection,props);

And then I iterate through the cloud stream (TupleStream).
So I am using streaming expressions (SolrJ).
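
The read loop over that stream is just the usual open / read-until-EOF /
close pattern, roughly (a sketch with error handling omitted; the field
names are the ones from the fl list above):

cloudstream.open()
try {
  var tuple = cloudstream.read()
  while (!tuple.EOF) {
    val a = tuple.getString("fieldA")
    // process the tuple here ...
    tuple = cloudstream.read()
  }
} finally {
  cloudstream.close()
}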

I have not looked at the solr logs from when I started getting the JSON parsing
exceptions, but I will let you know what I see the next time I run into the
same exceptions.

Thanks

On Sat, Nov 5, 2016 at 9:32 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Hmmm, export is supposed to handle 10s of million result sets. I know
> of a situation where the Streaming Aggregation functionality back
> ported to Solr 4.10 processes on that scale. So do you have any clue
> what exactly is failing? Is there anything in the Solr logs?
>
> _How_ are you using /export, through Streaming Aggregation (SolrJ) or
> just the raw xport handler? It might be worth trying to do this from
> SolrJ if you're not, it should be a very quick program to write, just
> to test we're talking 100 lines max.
>
> You could always roll your own cursor mark stuff by partitioning the
> data amongst N threads/processes if you have any reasonable
> expectation that you could form filter queries that partition the
> result set anywhere near evenly.
>
> For example, let's say you have a field with random numbers between 0
> and 100. You could spin off 10 cursorMark-aware processes each with
> its own fq clause like
>
> fq=partition_field:[0 TO 10}
> fq=[10 TO 20}
> 
> fq=[90 TO 100]
>
> Note the use of inclusive/exclusive end points
>
> Each one would be totally independent of all others with no
> overlapping documents. And since the fq's would presumably be cached
> you should be able to go as fast as you can drive your cluster. Of
> course you lose query-wide sorting and the like, if that's important
> you'd need to figure something out there.
>
> Do be aware of a potential issue. When regular doc fields are
> returned, for each document returned, a 16K block of data will be
> decompressed to get the stored field data. Streaming Aggregation
> (/xport) reads docValues entries which are held in MMapDirectory space
> so will be much, much faster. As of Solr 5.5. You can override the
> decompression stuff, see:
> https://issues.apache.org/jira/browse/SOLR-8220 for fields that are
> both stored and docvalues...
>
> Best,
> Erick
>
> On Sat, Nov 5, 2016 at 6:41 PM, Chetas Joshi <chetas.jo...@gmail.com>
> wrote:
> > Thanks Yonik for the explanation.
> >
> > Hi Erick,
> > I was using the /xport functionality. But it hasn't been stable (Solr
> > 5.5.0). I started running into run time Exceptions (JSON parsing
> > exceptions) while reading the stream of Tuples. This started happening as
> > the size of my collection increased 3 times and I started running queries
> > that return millions of documents (>10mm). I don't know if it is the
> query
> > result size or the actual data size (total number of docs in the
> > collection) that is causing the instability.
> >
> > org.noggit.JSONParser$ParseException: Expected ',' or '}':
> > char=5,position=110938 BEFORE='uuid":"0lG99s8vyaKB2I/
> > I","space":"uuid","timestamp":1 5' AFTER='DB6 474294954},{"uuid":"
> > 0lG99sHT8P5e'
> >
> > I won't be able to move to Solr 6.0 due to some constraints in our
> > production environment and hence moving back to the cursor approach. Do
> you
> > have any other suggestion for me?
> >
> > Thanks,
> > Chetas.
> >
> > On Fri, Nov 4, 2016 at 10:17 PM, Erick Erickson <erickerick...@gmail.com
> >
> > wrote:
> >
> >> Have you considered the /xport functionality?
> >>
> >> On Fri, Nov 4, 2016 at 5:56 PM, Yonik Seeley <ysee...@gmail.com> wrote:
> >> > No, you can't get cursor-marks ahead of time.
> >> > They are the serialized representation of the last sort values
> >> > encountered (hence not known ahead of time).
> >> >
> >> > -Yonik
> >> >
> >> >
> >> > On Fri, Nov 4, 2016 at 8:48 PM, Chetas Joshi <chetas.jo...@gmail.com>
> >> wrote:
> >> >> Hi,
> >> 

Re: Re-register a deleted Collection SolrCloud

2016-11-08 Thread Chetas Joshi
I won't be able to achieve the correct mapping as I did not store the
mapping info anywhere. I don't know if core_node1 was mapped to
shard1_replica1 or shard2_replica1 in my old collection. But I am not
worried about that as I am not going to update any existing document.

 This is what I did.

I created a new collection with the same schema and the same config.
Shut the SolrCloud down.
Then I copied the data directory.


hadoop fs -cp hdfs://prod/solr53/collection_old/*
hdfs://prod/solr53/collection_new/


Re-started the SolrCloud and I could see documents in the Solr UI when I
queried using the "/select" handler.


Thanks!



On Mon, Nov 7, 2016 at 2:59 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> You've got it. You should be quite safe if you
> 1> create the same number of shards as you used to have
> 2> match the shard bits. I.e. collection1_shard1_replica1 as long as
> the collection1_shard# parts match you should be fine. If this isn't
> done correctly, the symptom will be that when you update an existing
> document, you may have two copies returned eventually.
>
> Best,
> Erick
>
> On Mon, Nov 7, 2016 at 1:47 PM, Chetas Joshi <chetas.jo...@gmail.com>
> wrote:
> > Thanks Erick.
> >
> > I had replicationFactor=1 in my old collection and going to have the same
> > config for the new collection.
> > When I create a new collection with number of Shards =20 and max shards
> per
> > node = 1, the shards are going to start on 20 hosts out of my 25 hosts
> Solr
> > cluster. When you say "get each shard's index to the corresponding shard
> on
> > your new collection", do you mean the following?
> >
> > shard1_replica1 -> core_node1 (old collection)
> > shard1_replica1 -> has to be core_node1 (new collection) (I don't have
> this
> > mapping for the old collection as the collection no longer exists!!)
> >
> > Thanks,
> > Chetas.
> >
> > On Mon, Nov 7, 2016 at 1:03 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> That should work. The caveat here is that you need to get the each
> >> shards index to the corresponding shard on your new collection.
> >>
> >> Of course I'd back up _all_ of these indexes before even starting.
> >>
> >> And one other trick. First create your collection with 1 replica per
> >> shard (leader-only). Then copy the indexes (and, btw, I'd have the
> >> associated Solr nodes down during the copy) and verify the collection
> >> is as you'd expect.
> >>
> >> Now use ADDREPLICA to expand your collection, that'll handle the
> >> copying from the leader correctly.
> >>
> >> Best,
> >> Erick
> >>
> >> On Mon, Nov 7, 2016 at 12:49 PM, Chetas Joshi <chetas.jo...@gmail.com>
> >> wrote:
> >> > I have a Solr Cloud deployed on top of HDFS.
> >> >
> >> > I accidentally deleted a collection using the collection API. So,
> >> ZooKeeper
> >> > cluster has lost all the info related to that collection. I don't
> have a
> >> > backup that I can restore from. However, I have indices and
> transaction
> >> > logs on HDFS.
> >> >
> >> > If I create a new collection and copy the existing data directory to
> the
> >> > data directory path of the new collection I have created, will I be
> able
> >> to
> >> > go back to the state where I was? Is there anything else I would have
> to
> >> do?
> >> >
> >> > Thanks,
> >> >
> >> > Chetas.
> >>
>


Re: Re-register a deleted Collection SolrCloud

2016-11-07 Thread Chetas Joshi
Thanks Erick.

I had replicationFactor=1 in my old collection and going to have the same
config for the new collection.
When I create a new collection with the number of shards = 20 and max shards per
node = 1, the shards are going to start on 20 hosts out of my 25-host Solr
cluster. When you say "get each shard's index to the corresponding shard on
your new collection", do you mean the following?

shard1_replica1 -> core_node1 (old collection)
shard1_replica1 -> has to be core_node1 (new collection) (I don't have this
mapping for the old collection as the collection no longer exists!!)

Thanks,
Chetas.

On Mon, Nov 7, 2016 at 1:03 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> That should work. The caveat here is that you need to get the each
> shards index to the corresponding shard on your new collection.
>
> Of course I'd back up _all_ of these indexes before even starting.
>
> And one other trick. First create your collection with 1 replica per
> shard (leader-only). Then copy the indexes (and, btw, I'd have the
> associated Solr nodes down during the copy) and verify the collection
> is as you'd expect.
>
> Now use ADDREPLICA to expand your collection, that'll handle the
> copying from the leader correctly.
>
> Best,
> Erick
>
> On Mon, Nov 7, 2016 at 12:49 PM, Chetas Joshi <chetas.jo...@gmail.com>
> wrote:
> > I have a Solr Cloud deployed on top of HDFS.
> >
> > I accidentally deleted a collection using the collection API. So,
> ZooKeeper
> > cluster has lost all the info related to that collection. I don't have a
> > backup that I can restore from. However, I have indices and transaction
> > logs on HDFS.
> >
> > If I create a new collection and copy the existing data directory to the
> > data directory path of the new collection I have created, will I be able
> to
> > go back to the state where I was? Is there anything else I would have to
> do?
> >
> > Thanks,
> >
> > Chetas.
>


Re-register a deleted Collection SolrCloud

2016-11-07 Thread Chetas Joshi
I have a Solr Cloud deployed on top of HDFS.

I accidentally deleted a collection using the collection API. So, ZooKeeper
cluster has lost all the info related to that collection. I don't have a
backup that I can restore from. However, I have indices and transaction
logs on HDFS.

If I create a new collection and copy the existing data directory to the
data directory path of the new collection I have created, will I be able to
go back to the state where I was? Is there anything else I would have to do?

Thanks,

Chetas.


Re: Parallelize Cursor approach

2016-11-05 Thread Chetas Joshi
Thanks Yonik for the explanation.

Hi Erick,
I was using the /xport functionality. But it hasn't been stable (Solr
5.5.0). I started running into runtime exceptions (JSON parsing
exceptions) while reading the stream of Tuples. This started happening as
the size of my collection increased 3 times and I started running queries
that return millions of documents (>10mm). I don't know if it is the query
result size or the actual data size (total number of docs in the
collection) that is causing the instability.

org.noggit.JSONParser$ParseException: Expected ',' or '}':
char=5,position=110938 BEFORE='uuid":"0lG99s8vyaKB2I/
I","space":"uuid","timestamp":1 5' AFTER='DB6 474294954},{"uuid":"
0lG99sHT8P5e'

I won't be able to move to Solr 6.0 due to some constraints in our
production environment and hence moving back to the cursor approach. Do you
have any other suggestion for me?

Thanks,
Chetas.

On Fri, Nov 4, 2016 at 10:17 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Have you considered the /xport functionality?
>
> On Fri, Nov 4, 2016 at 5:56 PM, Yonik Seeley <ysee...@gmail.com> wrote:
> > No, you can't get cursor-marks ahead of time.
> > They are the serialized representation of the last sort values
> > encountered (hence not known ahead of time).
> >
> > -Yonik
> >
> >
> > On Fri, Nov 4, 2016 at 8:48 PM, Chetas Joshi <chetas.jo...@gmail.com>
> wrote:
> >> Hi,
> >>
> >> I am using the cursor approach to fetch results from Solr (5.5.0). Most
> of
> >> my queries return millions of results. Is there a way I can read the
> pages
> >> in parallel? Is there a way I can get all the cursors well in advance?
> >>
> >> Let's say my query returns 2M documents and I have set rows=100,000.
> >> Can I have multiple threads iterating over different pages like
> >> Thread1 -> docs 1 to 100K
> >> Thread2 -> docs 101K to 200K
> >> ..
> >> ..
> >>
> >> for this to happen, can I get all the cursorMarks for a given query so
> that
> >> I can leverage the following code in parallel
> >>
> >> cursorQ.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark)
> >> val rsp: QueryResponse = c.query(cursorQ)
> >>
> >> Thank you,
> >> Chetas.
>


Parallelize Cursor approach

2016-11-04 Thread Chetas Joshi
Hi,

I am using the cursor approach to fetch results from Solr (5.5.0). Most of
my queries return millions of results. Is there a way I can read the pages
in parallel? Is there a way I can get all the cursors well in advance?

Let's say my query returns 2M documents and I have set rows=100,000.
Can I have multiple threads iterating over different pages like
Thread1 -> docs 1 to 100K
Thread2 -> docs 101K to 200K
..
..

For this to happen, can I get all the cursorMarks for a given query so that
I can run the following code in parallel?

cursorQ.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark)
val rsp: QueryResponse = c.query(cursorQ)
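
where the full loop is essentially the following (a sketch; cursorQ and c
are the query and client from above). The catch is that each nextCursorMark
only comes back with the previous page's response, so the pages have to be
fetched one after another:

var cursorMark = CursorMarkParams.CURSOR_MARK_START
var done = false
while (!done) {
  cursorQ.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark)
  val rsp: QueryResponse = c.query(cursorQ)
  // process rsp.getResults() for this page ...
  val next = rsp.getNextCursorMark    // only known once this page has returned
  done = cursorMark == next
  cursorMark = next
}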

Thank you,
Chetas.


autoAddReplicas:true not working

2016-10-24 Thread Chetas Joshi
Hello,

I have the following configuration for the Solr cloud and a Solr collection.
This is Solr on HDFS, and the Solr version I am using is 5.5.0.

No. of hosts: 52 (Solr Cloud)

shard count:   50
replicationFactor:   1
MaxShardsPerNode: 1
autoAddReplicas:   true

Now, one of my shards is down. Although there are two hosts which are
available in my cloud on which a new replica could be created, it just does
not create a replica. All 52 hosts are healthy. What could be the reason
for this?

Thanks,

Chetas.


Re: /export handler to stream data using CloudSolrStream: JSONParse Exception

2016-10-21 Thread Chetas Joshi
Just to add to my previous question: I used dynamic shard splitting
while consuming data from the Solr collection using the /export handler.

On Fri, Oct 21, 2016 at 2:27 PM, Chetas Joshi <chetas.jo...@gmail.com>
wrote:

> Thanks Joel.
>
> I will migrate to Solr 6.0.0.
>
> However, I have one more question. Have you come across any discussion
> about Spark-on-Solr corrupting the data?
>
> So, I am getting the JSONParse exceptions only for a collection on which I
> tried loading the data using Spark Dataframe API (which internally uses
> /export handler to stream data using CloudSolrStream).
>
> The data loading using CloudSolrStream API from all the other collections
> works fine.
>
> Just want to know if you have come across this issue.
>
> Thanks,
>
> Chetas.
>
>
>
> On Thu, Oct 20, 2016 at 7:03 PM, Joel Bernstein <joels...@gmail.com>
> wrote:
>
>> I suspect this is a bug with improperly escaped json. SOLR-7441
>> <https://issues.apache.org/jira/browse/SOLR-7441> resolved this issue and
>> released in Solr 6.0.
>>
>> There have been a large number of improvements, bug fixes, new features
>> and
>> much better error handling in Solr 6 Streaming Expressions.
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>> On Thu, Oct 20, 2016 at 5:49 PM, Chetas Joshi <chetas.jo...@gmail.com>
>> wrote:
>>
>> > Hello,
>> >
>> > I am using /export handler to stream data using CloudSolrStream.
>> >
>> > I am using fl=uuid,space,timestamp where uuid and space are Strings and
>> > timestamp is long. My query (q=...) is not on these fields.
>> >
>> > While reading the results from the Solr cloud, I get the following
>> errors
>> >
>> > org.noggit.JSONParser$ParseException: Expected ',' or '}':
>> > char=5,position=110938
>> > BEFORE='uuid":"0lG99s8vyaKB2I/I","space":"uuid","timestamp":1 5'
>> > AFTER='DB6
>> > 474294954},{"uuid":"0lG99sHT8P5e'
>> >
>> >
>> > Or (For a different query
>> >
>> >
>> > org.noggit.JSONParser$ParseException: Expected ',' or '}':
>> > char=",position=122528
>> > BEFORE=':1475618674},{"uuid":"Whz991tX6P4beuhp","space": 3076 "'
>> > AFTER='uuid","timestamp":1476131442},{"uui'
>> >
>> >
>> > Now what are the possible reasons of me getting this error?
>> >
>> >
>> > Is this related to some kind of data corruption?
>> >
>> >
>> > What are some of the things (possibly some characters in String) that
>> JSON
>> > will have hard time parsing?
>> >
>> >
>> > The Solr version I use is 5.5.0
>> >
>> >
>> > Thanks
>> >
>> >
>> > Chetas.
>> >
>>
>
>


Re: /export handler to stream data using CloudSolrStream: JSONParse Exception

2016-10-21 Thread Chetas Joshi
Thanks Joel.

I will migrate to Solr 6.0.0.

However, I have one more question. Have you come across any discussion
about Spark-on-Solr corrupting the data?

So, I am getting the JSONParse exceptions only for a collection on which I
tried loading the data using Spark Dataframe API (which internally uses
/export handler to stream data using CloudSolrStream).

The data loading using CloudSolrStream API from all the other collections
works fine.

Just want to know if you have come across this issue.

Thanks,

Chetas.



On Thu, Oct 20, 2016 at 7:03 PM, Joel Bernstein <joels...@gmail.com> wrote:

> I suspect this is a bug with improperly escaped json. SOLR-7441
> <https://issues.apache.org/jira/browse/SOLR-7441> resolved this issue and
> released in Solr 6.0.
>
> There have been a large number of improvements, bug fixes, new features and
> much better error handling in Solr 6 Streaming Expressions.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Thu, Oct 20, 2016 at 5:49 PM, Chetas Joshi <chetas.jo...@gmail.com>
> wrote:
>
> > Hello,
> >
> > I am using /export handler to stream data using CloudSolrStream.
> >
> > I am using fl=uuid,space,timestamp where uuid and space are Strings and
> > timestamp is long. My query (q=...) is not on these fields.
> >
> > While reading the results from the Solr cloud, I get the following errors
> >
> > org.noggit.JSONParser$ParseException: Expected ',' or '}':
> > char=5,position=110938
> > BEFORE='uuid":"0lG99s8vyaKB2I/I","space":"uuid","timestamp":1 5'
> > AFTER='DB6
> > 474294954},{"uuid":"0lG99sHT8P5e'
> >
> >
> > Or (For a different query
> >
> >
> > org.noggit.JSONParser$ParseException: Expected ',' or '}':
> > char=",position=122528
> > BEFORE=':1475618674},{"uuid":"Whz991tX6P4beuhp","space": 3076 "'
> > AFTER='uuid","timestamp":1476131442},{"uui'
> >
> >
> > Now what are the possible reasons of me getting this error?
> >
> >
> > Is this related to some kind of data corruption?
> >
> >
> > What are some of the things (possibly some characters in String) that
> JSON
> > will have hard time parsing?
> >
> >
> > The Solr version I use is 5.5.0
> >
> >
> > Thanks
> >
> >
> > Chetas.
> >
>


Re: For TTL, does expirationFieldName need to be indexed?

2016-10-20 Thread Chetas Joshi
You just need to have indexed=true. It will use the inverted index to
delete the expired documents. You don't need stored=true as all the info
required by the DocExpirationUpdateProcessorFactory to delete a document is
there in the inverted index.
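
For reference, a minimal sketch of the update chain (the chain name, field
names and delete period below are just examples):

<updateRequestProcessorChain name="add-expiration" default="true">
  <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
    <int name="autoDeletePeriodSeconds">300</int>
    <str name="ttlFieldName">time_to_live_s</str>
    <str name="expirationFieldName">expire_at_dt</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

Here expire_at_dt just needs to be an indexed date field; stored can stay
false.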

On Thu, Oct 20, 2016 at 4:26 PM, Brent  wrote:

> Thanks for the reply.
>
> Follow up:
> Do I need to have the field stored? While I don't need to ever look at the
> field's original contents, I'm guessing that the
> DocExpirationUpdateProcessorFactory does, so that would mean I need to
> have
> stored=true as well, correct?
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/For-TTL-does-expirationFieldName-need-to-
> be-indexed-tp4301522p4302386.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


/export handler to stream data using CloudSolrStream: JSONParse Exception

2016-10-20 Thread Chetas Joshi
Hello,

I am using /export handler to stream data using CloudSolrStream.

I am using fl=uuid,space,timestamp where uuid and space are Strings and
timestamp is long. My query (q=...) is not on these fields.

While reading the results from the Solr cloud, I get the following errors

org.noggit.JSONParser$ParseException: Expected ',' or '}':
char=5,position=110938
BEFORE='uuid":"0lG99s8vyaKB2I/I","space":"uuid","timestamp":1 5' AFTER='DB6
474294954},{"uuid":"0lG99sHT8P5e'


Or (for a different query):


org.noggit.JSONParser$ParseException: Expected ',' or '}':
char=",position=122528
BEFORE=':1475618674},{"uuid":"Whz991tX6P4beuhp","space": 3076 "'
AFTER='uuid","timestamp":1476131442},{"uui'


Now what are the possible reasons of me getting this error?


Is this related to some kind of data corruption?


What are some of the things (possibly some characters in a String) that the
JSON parser will have a hard time parsing?


The Solr version I use is 5.5.0


Thanks


Chetas.


Re: Solr on HDFS: adding a shard replica

2016-09-13 Thread Chetas Joshi
Is this happening because I have set replicationFactor=1?
So even if I manually add a replica for the shard that's down, will it just
create a dataDir but not copy any of the data into the dataDir?
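
For reference, manually adding the replica boils down to a Collections API
call like this (host, collection and shard names are placeholders):

http://host:8983/solr/admin/collections?action=ADDREPLICA&collection=collection1&shard=shard1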

On Tue, Sep 13, 2016 at 6:07 PM, Chetas Joshi <chetas.jo...@gmail.com>
wrote:

> Hi,
>
> I just started experimenting with solr cloud.
>
> I have a solr cloud of 20 nodes. I have one collection with 18 shards
> running on 18 different nodes with replication factor=1.
>
> When one of my shards goes down, I create a replica using the Solr UI. On
> HDFS I see a core getting added. But the data (index table and tlog)
> information does not get copied over to that directory. For example, on
> HDFS I have
>
> /solr/collection/core_node_1/data/index
> /solr/collection/core_node_1/data/tlog
>
> when I create a replica of a shard, it creates
>
> /solr/collection/core_node_19/data/index
> /solr/collection/core_node_19/data/tlog
>
> (core_node_19 as I already have 18 shards for the collection). The issue
> is both my folders  core_node_19/data/index and core_node_19/data/tlog are
> empty. Data does not get copied over from core_node_1/data/index and
> core_node_1/data/tlog.
>
> I need to remove core_node_1 and just keep core_node_19 (the replica). Why
> the data is not getting copied over? Do I need to manually move all the
> data from one folder to the other?
>
> Thank you,
> Chetas.
>
>


Solr on HDFS: adding a shard replica

2016-09-13 Thread Chetas Joshi
Hi,

I just started experimenting with solr cloud.

I have a solr cloud of 20 nodes. I have one collection with 18 shards
running on 18 different nodes with replication factor=1.

When one of my shards goes down, I create a replica using the Solr UI. On
HDFS I see a core getting added, but the data (index and tlog)
does not get copied over to that directory. For example, on
HDFS I have

/solr/collection/core_node_1/data/index
/solr/collection/core_node_1/data/tlog

when I create a replica of a shard, it creates

/solr/collection/core_node_19/data/index
/solr/collection/core_node_19/data/tlog

(core_node_19 as I already have 18 shards for the collection). The issue is
both my folders  core_node_19/data/index and core_node_19/data/tlog are
empty. Data does not get copied over from core_node_1/data/index and
core_node_1/data/tlog.

I need to remove core_node_1 and just keep core_node_19 (the replica). Why
the data is not getting copied over? Do I need to manually move all the
data from one folder to the other?

Thank you,
Chetas.