Parallel SQL Interface and 'qt'

2021-03-03 Thread Jostein Elvaker Haande
Hi,

I've just started to look into the Parallel SQL interface available in
SOLR. I've done some tests across a few collections, and it works fairly
well.

However, I've run into an issue with a few collections where the SQL
interface does not return any data. According to the documentation, it
seems to rely on the default /select handler to look up and return
data. The collections that do not return data through the SQL interface,
though, all use /select handlers with logic that requires additional
parameters to return data.

I know that if you set 'handleSelect' to true in the /select handler,
you can pass the 'qt' parameter to choose which handler to use on the fly.
The handler in question is configured to use 'handleSelect', but when I
pass the 'qt' parameter to the /sql handler, it does not seem to have any
effect.

I've gone through the documentation, but I can't find any information
on this. Is it possible to define which handler is used when querying
through the /sql handler?
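
A minimal sketch of the kind of request described above, assuming the
/sql handler is queried with a 'stmt' parameter and that 'qt' is simply
added alongside it. The host, the collection name 'myCollection' and the
handler name '/myHandler' are hypothetical, and whether /sql honours 'qt'
at all is exactly the open question here:

```python
from urllib.parse import urlencode


def build_sql_request(base_url, collection, stmt, extra_params=None):
    """Return (url, form-encoded body) for a POST to a collection's /sql handler."""
    url = f"{base_url.rstrip('/')}/{collection}/sql"
    params = {"stmt": stmt}
    # Extra parameters such as 'qt' are appended after the statement.
    params.update(extra_params or {})
    return url, urlencode(params).encode("utf-8")


# Hypothetical host, collection and handler names, for illustration only.
url, body = build_sql_request(
    "http://localhost:8983/solr",
    "myCollection",
    "SELECT id FROM myCollection LIMIT 10",
    {"qt": "/myHandler"},
)
print(url)
print(body.decode("utf-8"))
```

The function only builds the request, so it can be inspected before
anything is sent to a live cluster.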

-- 
Yours sincerely Jostein Elvaker Haande
"A free society is a society where it is safe to be unpopular"
- Adlai Stevenson

https://tolecnal.net -- tolecnal at tolecnal dot net


Re: Get transaction count from ZooKeeper transaction logs

2018-07-10 Thread Jostein Elvaker Haande
On Wed, 11 Jul 2018 at 02:48, Shawn Heisey wrote:

> On 7/10/2018 3:32 PM, Jostein Elvaker Haande wrote:
> > I'm trying to find an effective way to find the number of transactions
> You have detailed questions about the inner workings of ZooKeeper.  This
> is not a ZooKeeper mailing list.  It is a Solr mailing list.  The
> ZooKeeper project has its own support resources.  That project is going
> to be in a far better position to have access to the information that
> you need.
>

Thanks Shawn, I've redirected my question to the ZooKeeper mailing list.

-- 
Yours sincerely Jostein Elvaker Haande
"A free society is a society where it is safe to be unpopular"
- Adlai Stevenson

http://tolecnal.net -- tolecnal at tolecnal dot net


Get transaction count from ZooKeeper transaction logs

2018-07-10 Thread Jostein Elvaker Haande
Hello,

I'm looking for an effective way to get the number of transactions
stored in the ZooKeeper transaction logs. The only method I've found so far
is the Java class 'org.apache.zookeeper.server.LogFormatter', which
prints the following after it has formatted a log file:

  EOF reached after 203 txns.

Now I could of course make a script to process each log file through this
log formatter, and extract the count from the last line of stdout, but I'm
wondering if there's an easier method.
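
A rough sketch of such a script, assuming the formatter is run with
'java -cp' and that its summary line always has the form shown above.
The classpath and log file paths are placeholders that depend on the
installation:

```python
import re
import subprocess
from typing import Iterable, Optional

# Matches the summary line quoted above, e.g. "EOF reached after 203 txns."
_EOF_LINE = re.compile(r"EOF reached after (\d+) txns\.")


def txn_count(formatter_output: str) -> Optional[int]:
    """Return the count from the last summary line, or None if there is none."""
    matches = _EOF_LINE.findall(formatter_output)
    return int(matches[-1]) if matches else None


def total_txns(outputs: Iterable[str]) -> int:
    """Sum the counts over several outputs, skipping any without a summary line."""
    return sum(c for c in map(txn_count, outputs) if c is not None)


def run_log_formatter(classpath: str, log_file: str) -> str:
    """Run LogFormatter on one log file and return its stdout.

    'classpath' must point at the ZooKeeper jars and their dependencies;
    the exact value is an assumption left to the caller.
    """
    result = subprocess.run(
        ["java", "-cp", classpath,
         "org.apache.zookeeper.server.LogFormatter", log_file],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

With these helpers, 'total_txns(run_log_formatter(cp, f) for f in files)'
gives the combined count, though it still formats every log in full.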

I've read through the ZK documentation, and tried the ZK commands (aka The
Four Letter Words) to see if any of these offer this metric, but I could
not see it.

So my question is - is there a simpler approach to find this count?

-- 
Yours sincerely Jostein Elvaker Haande
"A free society is a society where it is safe to be unpopular"
- Adlai Stevenson

http://tolecnal.net -- tolecnal at tolecnal dot net


Re: Deleted documents and expungeDeletes

2016-04-01 Thread Jostein Elvaker Haande
On 30 March 2016 at 17:46, Erick Erickson wrote:
> through a clever bit of reflection, you can set the
> reclaimDeletesWeight variable from solrconfig by including something
> like
> <double name="reclaimDeletesWeight">5</double> (going from memory
> here, you'll get an error on startup if I've messed it up.)

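I added the following to my solrconfig a couple of days ago:

<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">8</int>
  <int name="segmentsPerTier">8</int>
  <double name="reclaimDeletesWeight">5.0</double>
</mergePolicy>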

There have been several commits and the core is current according to
SOLR admin, but I'm still seeing a lot of deleted docs. These are
my current core statistics:

Last Modified: 4 minutes ago
Num Docs: 1 675 255
Max Doc: 2 353 476
Heap Memory Usage: 208 464 267
Deleted Docs: 678 221
Version: 1 870 539
Segment Count: 39

Index size is close to 149GB.

So at the moment, deleted docs make up about 28.8% of max docs. With
'reclaimDeletesWeight' set to 5, it doesn't seem to be reclaiming any
deleted docs.
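
The arithmetic behind that ratio, as a small sketch using the statistics
above (the function name is only illustrative):

```python
def deleted_ratio(max_doc: int, num_docs: int) -> float:
    """Fraction of Max Doc that is made up of deleted documents."""
    if max_doc <= 0:
        raise ValueError("max_doc must be positive")
    return (max_doc - num_docs) / max_doc


max_doc, num_docs = 2_353_476, 1_675_255
deleted = max_doc - num_docs          # 678221, matching "Deleted Docs"
ratio = deleted_ratio(max_doc, num_docs)
print(f"{deleted} deleted docs, {ratio:.2%} of max docs")  # 28.82%
```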

Anything obvious I'm missing?

-- 
Yours sincerely Jostein Elvaker Haande
"A free society is a society where it is safe to be unpopular"
- Adlai Stevenson

http://tolecnal.net -- tolecnal at tolecnal dot net


Re: Deleted documents and expungeDeletes

2016-03-30 Thread Jostein Elvaker Haande
On 30 March 2016 at 12:25, Markus Jelsma wrote:
> Hello - with TieredMergePolicy and default reclaimDeletesWeight of 2.0, and 
> frequent updates, it is not uncommon to see a ratio of 25%. If you want 
> deletes to be reclaimed more often, e.g. weight of 4.0, you will see very 
> frequent merging of large segments, killing performance if you are on 
> spinning disks.

Most of our installations are on spinning disks, so a more aggressive
reclaim would hurt performance, which is of course something I want
to avoid. I'm therefore wondering if scheduling a commit with
'expungeDeletes' during off-peak business hours is a better approach
than setting up a more aggressive merge policy.
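
A minimal sketch of what such a scheduled job could run, assuming the
commit is sent as an XML update message to the core's /update handler.
The host and core name are placeholders, and the scheduling itself (cron
or similar) is left out:

```python
import urllib.request

# XML update message asking Solr to commit and expunge deleted documents.
COMMIT_BODY = b'<commit expungeDeletes="true"/>'


def build_expunge_request(base_url: str, core: str) -> urllib.request.Request:
    """Build the POST request for a commit with expungeDeletes on one core."""
    url = f"{base_url.rstrip('/')}/{core}/update"
    return urllib.request.Request(
        url,
        data=COMMIT_BODY,
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


def expunge_deletes(base_url: str, core: str, timeout: float = 3600.0) -> int:
    """Send the commit and return the HTTP status code.

    The long default timeout is an assumption: merging large segments
    can take a while on spinning disks.
    """
    request = build_expunge_request(base_url, core)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling 'expunge_deletes("http://localhost:8983/solr", "coreName")' from
a nightly job would confine the extra merge I/O to quiet hours.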

-- 
Yours sincerely Jostein Elvaker Haande
"A free society is a society where it is safe to be unpopular"
- Adlai Stevenson

http://tolecnal.net -- tolecnal at tolecnal dot net


Re: Deleted documents and expungeDeletes

2016-03-30 Thread Jostein Elvaker Haande
On 30 March 2016 at 02:49, Erick Erickson wrote:
> Please specify "growing and growing", Until it gets to 15% or more of the
> total then I'd start to worry. And then only if it kept growing after that.

I tested 'expungeDeletes' on four different cores, three of which were
nearly identical in terms of numbers. Max Docs was ~2.2M, Num Docs was
~1.6M and Deleted Docs was ~600K, so the percentage of Deleted Docs was
around the 27 percent mark. So according to your feedback, I should
start to worry! Now the question is, why aren't the Deleted Docs being
merged away if this is in fact supposed to happen?

> 1> This is automatic. It'll "just happen", but you will probably always carry
> some deleted docs around in your index.

Yeah, that I am aware of - I noticed that even after running
'expungeDeletes' I had a few thousand docs left, which is acceptable
and does not worry me.

> 4> True, but usually the effect is so minuscule that nobody notices.
> People spend endless time obsessing about this and unless and until you
> can show that your _users_ notice, I'd ignore it.

Hehe, then I'll refrain from being one of those that obsess over this.
As long as I know the effect it has is minuscule, then I'll just toss
the thought in the bin.

-- 
Yours sincerely Jostein Elvaker Haande
"A free society is a society where it is safe to be unpopular"
- Adlai Stevenson

http://tolecnal.net -- tolecnal at tolecnal dot net


Deleted documents and expungeDeletes

2016-03-29 Thread Jostein Elvaker Haande
Hello everyone,

I apologise beforehand if this is a question that has been visited
numerous times on this list, but after hours spent on Google and
talking to SOLR savvy people on #solr @ Freenode I'm still a bit at a
loss about SOLR and deleted documents.

I have quite a few indexes in both production and development
environments, where I see that the number of deleted documents just
keeps on growing and growing, but they never seem to be removed. From
my understanding, this can be controlled in the merge policy set for
the current core, but I've not been able to find any specifics on the
topic.

The general consensus in most search hits I've found is to optimize
the core. However, this is an expensive operation in terms of both CPU
cycles and disk I/O, and it requires anywhere from 2 to 3 times the
size of the index in free disk space to be guaranteed to complete.
Given these criteria, it's often not a viable option in certain
environments, both because it is a resource hog and because you often
just don't have the disk space needed to perform the optimize.

After having spoken with a couple of people on IRC (thanks tokee and
elyograg), I was made aware of an optional parameter for <commit>
called 'expungeDeletes' that explicitly makes sure that deleted
documents are removed from the index, e.g.:

curl http://localhost:8983/solr/coreName/update \
  -H "Content-Type: text/xml" \
  --data-binary '<commit expungeDeletes="true"/>'

Now my questions are as follows:

1) How can I make sure that this is dealt with in my merge policy, if
at all possible?
2) I've tried to find some disk space guidelines for 'expungeDeletes',
however I've not been able to find any. What are the general
guidelines here? Does it require as much space as an optimize, or is
it less "aggressive" compared to an optimize?
3) Is 'expungeDeletes' the recommended method to make sure your
deleted documents are actually removed from the index, or should you
deal with this in your merge policy?
4) I have also heard in discussions on #solr that deleted documents
have an impact on the relevancy of performed searches. Is this correct,
or just misinformation?

If you require any additional information, like snippets from my
configuration (solrconfig.xml), I'm more than happy to provide this.

Again, if this is an issue that's being revisited for the Nth time, I
apologize, I'm just trying to get my head around this with my somewhat
limited SOLR knowledge.

-- 
Yours sincerely Jostein Elvaker Haande
"A free society is a society where it is safe to be unpopular"
- Adlai Stevenson

http://tolecnal.net -- tolecnal at tolecnal dot net