I think I've already mentioned that the new query is different.
The reference to 8.2.1 is included here, where the old query can also be
found:
https://www.mail-archive.com/basex-talk%40mailman.uni-konstanz.de/msg06544.html
With kind regards,
Menashè
On 08/03/2015 03:38 PM, Christian Grün wrote:
>> What was the last version it was working with?
> 8.2.1. Not really working, but better...
I ran the attached query with 8.2.1, and no index was used either. Are
you sure you sent me the correct query?
Sorry for confronting you with all those questions, but to help you, I
really need your help
On 08/03/2015 03:24 PM, Christian Grün wrote:
What was the last version it was working with?
8.2.1. Not really working, but better...
> I'm again having performance problems. I have BaseX 8.2.2.
What was the last version it was working with?
:)
I had planned to do it as a second step, but I should do it earlier.
Thank you.
With kind regards,
Menashè
On 07/14/2015 03:22 PM, Christian Grün wrote:
...it only makes sense if you store the data in its normalized representation.
On Tue, Jul 14, 2015 at 2:42 PM, Menashè Eliezer
wrote:
H
...it only makes sense if you store the data in its normalized representation.
On Tue, Jul 14, 2015 at 2:42 PM, Menashè Eliezer
wrote:
> Hi,
> It sounds like a great idea and I can also apply it to the date
> comparisons, but unfortunately the new query is much slower.
> Please see the attac
Hi,
It sounds like a great idea and I can also apply it to the date
comparisons, but unfortunately the new query is much slower.
Please see the attached log.
With kind regards,
Menashè
On 07/14/2015 12:50 PM, Christian Grün wrote:
Should geo:within of http://docs.basex.org/wiki/Geo_Module
> Should geo:within of http://docs.basex.org/wiki/Geo_Module help?
The functions of the Geo Module don't use any index structures, so I
am afraid they won't speed up the query.
One more idea: you could convert all latitudes and longitudes to
strings with a fixed number of digits
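A minimal sketch of the fixed-width idea, assuming latitudes lie in [-90, 90] and longitudes in [-180, 180]. The helper name local:fixed and the offset trick are illustrative assumptions, not something taken from the thread: shifting every value to be non-negative and padding it to a fixed number of digits makes plain string comparison agree with numeric order, so the comparison can be done on strings.

```xquery
(: Hypothetical helper: shift a coordinate by an offset so it is never
   negative, then pad it to 3 integer and 6 fraction digits. :)
declare function local:fixed(
  $value  as xs:decimal,
  $offset as xs:decimal
) as xs:string {
  format-number($value + $offset, '000.000000')
};

(: Latitudes use an offset of 90, longitudes an offset of 180. :)
local:fixed(-12.5, 90),    (: "077.500000" :)
local:fixed(45.25, 90),    (: "135.250000" :)
local:fixed(-3.75, 180)    (: "176.250000" :)
```

This only helps if the stored values and the search bounds are both written in the same normalized form.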
Hi,
On 07/14/2015 11:05 AM, Christian Grün wrote:
It may be slightly faster if you remove the explicit string() conversion
No, it's actually slower.
But please note that BaseX provides no native range index, which would
be a good fit for your longitude/latitude filter.
Should geo:within of
> oops, I'm sorry. It's attached.
> There are text and attribute indexes.
It may be slightly faster if you remove the explicit string() conversion:
for $x in db:open("CDI")
let $beginPosition := $x//startTime
where $beginPosition >= "1889-01-01" and
$beginPosition <= "2015-07-10"
Hi Christian,
oops, I'm sorry. It's attached.
There are text and attribute indexes.
With kind regards,
Menashè
On 07/14/2015 09:32 AM, Christian Grün wrote:
Hi Menashè,
The attached log file is empty. Maybe it's sufficient if you provide
us with the query and give us information on the query c
Hi Menashè,
The attached log file is empty. Maybe it's sufficient if you provide
us with the query and give us information on the query compilation
(are any indexes used?).
C.
On Mon, Jul 13, 2015 at 3:32 PM, Menashè Eliezer
wrote:
> Hello,
> Creating a database of partial xml documents had al
Hello,
Creating a database of partial xml documents had almost no effect.
Therefore I've created a database with very simple xml structure. I'm
attaching an example (demo.xml).
BaseX version: 8.2.2
Number of documents: 374739
However, the attached query takes 4 seconds (attached simple_query.lo
Hi,
Just create a new database from the input data with this option turned on.
I've expected db:add to do it. Not important.
If it's not well-formed, you can't store it in BaseX. If you can do
so, it would be an error (and rather surprising to me ;).
Well, the not well-formed is the respons
> Anyway, how can I strip the namespaces in my new database? I don't need
> them.
Just create a new database from the input data with this option turned on.
> I've used db:add. The data is not well-formed. It's just like copy&paste of
> the relevant xml. No headers.
If it's not well-formed, you
Hi Christian,
True. I forgot to mention that the 'stripns' option (as all other XML
parsing options [1]) only applies to newly parsed XML strings.
But these strings belong to new documents being added using db:add.
Anyway, how can I strip the namespaces in my new database? I don't need
them.
Hi Menashè,
> I've used map { 'stripns': true(), 'intparse': true() }) in db:add, but the
> namespaces were not removed, e.g. there is gml:beginPosition.
True. I forgot to mention that the 'stripns' option (as all other XML
parsing options [1]) only applies to newly parsed XML strings.
> Anyway
Hi Christian,
I've created a new database with only the relevant part of each xml.
It's much smaller and I hope it would help.
The created xml is not a valid one since the xml and xml-model tags are
missing, but it shouldn't be a problem.
I've used map { 'stripns': true(), 'intparse': true() }
> I couldn't find an option in db:add to specify an XPath. In my case, I
> need to extract only the elements under
> /gmd:MD_Metadata/gmd:identificationInfo/sdn:SDN_DataIdentification
We try to avoid XPath strings arguments whenever possible. Instead,
simply use XQuery, which allows you to do al
Hi Christian,
The usual approach is to simply create another database that only
contains the relevant parts of your document. This can directly be
done in XQuery (using db:create, db:add, ...), or, if memory
consumption is too high, by exporting and importing parts of your
document.
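A rough sketch of that approach, using the database name "CDI" and the element path quoted elsewhere in this thread. The target database name "CDI-PARTS", the generated document paths, and the sdn namespace URI are assumptions made for illustration. With several hundred thousand documents this may need a lot of memory, in which case the export/import route mentioned above is the safer choice.

```xquery
declare namespace gmd = "http://www.isotc211.org/2005/gmd";
(: Assumed namespace URI for the sdn prefix; check it against your data. :)
declare namespace sdn = "http://www.seadatanet.org";

(: Select only the relevant subtree of every document in the source database. :)
let $parts := db:open("CDI")
  /gmd:MD_Metadata/gmd:identificationInfo/sdn:SDN_DataIdentification

(: One hypothetical document path per extracted subtree. :)
let $paths :=
  for $part at $i in $parts
  return "part-" || $i || ".xml"

(: Create a new, smaller database from the extracted nodes. :)
return db:create("CDI-PARTS", $parts, $paths)
```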
I couldn't fi
Thank you Christian for the helpful reply.
With kind regards,
Menashè
On 06/23/2015 01:32 PM, Christian Grün wrote:
Is there also an option to define inside the part only the xpaths which I would
need?
I guess no, but to be honest, I am not exactly sure what you mean?
Would you like to restri
> Is there also an option to define inside the part only the xpaths which I
> would need?
I guess no, but to be honest, I am not exactly sure what you mean?
Would you like to restrict indexing to specific parts of the document?
In that case, you'll have to wait for someone implementing [1]
(contr
Thank you Christian,
I may try it later as a last option. I hope you can find an alternative
solution.
Is there also an option to define inside the part only the xpaths which
I would need?
Otherwise, many elements and attributes which I don't need are being
indexed.
Another question, how can
> Is there an option to ask BaseX to parse only a part of the imported xml
> files under a specific xpath, (or at least limit useless indexing of non
> relevant components)? I don't need the rest of the xml files, even though
> it's not too big. Maybe it can help.
The usual approach is to simply c
Hi Menashè,
I am not sure if I can propose any way out, because there are too many
factors that would need to be looked at right now (automatically
composed queries, no node ids, gigabytes of data, ...).
So let's maybe go back to your original observation:
> Once I specify XPath, it seems that
Hi Christian,
Even when I leave only the first filter and test it as standalone it
takes more than 8 seconds:
Result:
- Hit(s): 25 Items
- Updated: 0 Items
- Printed: 2048 KB
- Read Locking: local [CDI]
- Write Locking: none
Timing:
- Parsing: 2.0 ms
- Compiling:
Hi Menashè,
>> QUERY[0] xquery version "3.0"; declare namespace queryName ='GetIDS';
>> declare namespace gco = "http://www.isotc211.org/2005/gco"; declare
>> [...]
It would be great if you could help us and simplify the query, such
that we can have a look at the core issue.
>> Id there an und
Hi,
I've used ssh -X for producing query info right from the server machine.
Please see attached.
I hope it would help.
With kind regards,
Menashè
On 06/22/2015 04:48 PM, Menashè Eliezer wrote:
Hi Christian,
I'm again having performance problems. I have BaseX 8.2.1.
As you may remember, you've
Hi Christian,
I'm again having performance problems. I have BaseX 8.2.1.
As you may remember, you've recommended changing
'for $x in collection("CDI")' to 'for $x in
collection("CDI")/gmd:MD_Metadata/gmd:identificationInfo/sdn:SDN_DataIdentification'.
However, I've discovered I cannot specify XPath
;) Looks good!
Thanks for the updated report,
Christian
On Tue, Feb 3, 2015 at 1:13 PM, Menashè Eliezer wrote:
> Hi Christian,
>
> Thank you! The query time is down to 0.5 sec!
>
> The biggest improvement is related to the query rephrasing you've suggested.
> Then the latest snapshot also help
Hi Christian,
Thank you! The query time is down to 0.5 sec!
The biggest improvement is related to the query rephrasing you've suggested.
Then the latest snapshot also helps a lot!
You may want to know that in the log of the latest snapshot I see
applying attribute index for "7827"
which is not
Hi Christian,
Thank you very much! Unfortunately I'll be at the office only tomorrow.
Menashè
On Sat, 31 Jan 2015 16:42:32 +0100, Christian Grün
wrote:
> Hi Menashè,
>
> With the latest snapshot [1], your original query should now be
> rewritten for index access as well. Looking forward to you
Hi Menashè,
With the latest snapshot [1], your original query should now be
rewritten for index access as well. Looking forward to your tests,
Christian
PS: In terms of performance, it may still be worthwhile to move
redundant paths to the for clause; but just try and see.
[1] http://files.base
Hi Menashè,
> Should I expect to see the usage of an index for each of the where phrases?
Usually, only one predicate will be rewritten for index access, and
the remaining conditions will be answered sequentially.
> Have a nice weekend!
Enjoy,
Christian
> Menashè
>
> On Fri, 30 Jan 2015 18:11
Hi Christian,
Interesting! I'll check it when I'm back at the office and keep you
updated.
I'll use for $x in
collection("ALL-CDIS")/gmd:MD_Metadata/gmd:identificationInfo/sdn:SDN_DataIdentification
as you've suggested.
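For reference, a sketch of the rephrased query shape. The collection name and element path are the ones quoted in this thread; the sdn namespace URI and the comparison value "XXXX" are placeholders assumed for illustration.

```xquery
declare namespace gmd = "http://www.isotc211.org/2005/gmd";
(: Assumed namespace URI for the sdn prefix; check it against your data. :)
declare namespace sdn = "http://www.seadatanet.org";

(: The shared path prefix sits in the for clause, so $x is already the
   SDN_DataIdentification element and the where clause only carries the
   remaining relative path. :)
for $x in collection("ALL-CDIS")
  /gmd:MD_Metadata/gmd:identificationInfo/sdn:SDN_DataIdentification
where $x/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:keyword
        /sdn:SDN_ParameterDiscoveryCode/@codeListValue = "XXXX"
return $x
```

Whether an index is actually applied can be checked in the query info, which should then contain a line such as "applying attribute index".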
Should I expect to see the usage of an index for each of the where phrases?
H
Hi Menashè,
Thanks for the XML samples you sent me in private. I noticed that the
index rewritings will only be triggered if you formulate your query as
follows:
OLD:
for $x in collection("ALL-CDIS")
where $x/gmd:MD_Metadata/gmd:identificationInfo/...
return ...
NEW:
for $x in collection
Could you possibly provide me with a small snapshot of your data
sources (one, two documents might be sufficient)?
On Fri, Jan 30, 2015 at 5:52 PM, Menashè Eliezer
wrote:
> Almost the same speed with version 8.0.
> No indexing (no "applying" in the query info).
> As I've attached before, indexes
Almost the same speed with version 8.0.
No indexing (no "applying" in the query info).
As I've attached before, indexes are active for this DB.
With kind regards,
Menashè
On 01/30/2015 05:31 PM, Christian Grün wrote:
It's indeed interesting that your query does not use any of the
existing index
It's indeed interesting that your query does not use any of the
existing index structures (if they did, you would find strings like
"applying text index" or "applying attribute index" in the query
info). Maybe/hopefully things look different with Version 8.0.
On Fri, Jan 30, 2015 at 5:26 PM, Mena
On 01/30/2015 05:18 PM, Christian Grün wrote:
/gmd:MD_Metadata/gmd:identificationInfo/sdn:SDN_DataIdentification/gmd:descriptiveKeywords[1]/gmd:MD_Keywords/gmd:keyword[2]/sdn:SDN_ParameterDiscoveryCode/@codeListValue
How can I remove *?
Simply remove the predicate; a[*]/b is the same as a/b.
Ma
/gmd:MD_Metadata/gmd:identificationInfo/sdn:SDN_DataIdentification/gmd:descriptiveKeywords[1]/gmd:MD_Keywords/gmd:keyword[2]/sdn:SDN_ParameterDiscoveryCode/@codeListValue
> How can I remove *?
Simply remove the predicate; a[*]/b is the same as a/b.
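A small self-contained check of that equivalence on an in-memory fragment (the element names are made up for illustration). The predicate [*] only keeps elements that have at least one child element, and an element without children has no b child either, so both paths select the same nodes.

```xquery
let $a := <a><b>1</b><b>2</b></a>
return deep-equal($a[*]/b, $a/b)   (: true :)
```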
>> * In some cases, if you know that an element
Hi Christian,
Thank you for your reply. Updated files are attached.
On 01/30/2015 04:35 PM, Christian Grün wrote:
Hi Menashè,
First of all, I wonder if your query really does what you want it to
do. I noticed for example that some of the where conditions start with
"$x/", while others start
Hi Menashè,
First of all, I wonder if your query really does what you want it to
do. I noticed for example that some of the where conditions start with
"$x/", while others start with "/" and some others start with no
slash. Is this intentional?
Some more comments:
* I would recommend you to avoi
Hello,
I wonder if the attached query can be optimised. I'm attaching all
relevant information.
BaseX 7.9, Debian, powerful server.
This is just an example. The queries will be built based on a
compilation of a search form.
So reordering the conditions for having smaller subset right from the
On 19.11.2012 at 23:00, Christian Grün wrote:
>>let $title := (db:open-id('TG-DTA-GerManC-stemming-ws', $node)
>> /ancestor::*:TEI[1]//*:fileDesc)[1]//*:titleStmt[1]//*:title[1]
>
> Do you think that the following query would return the expected result?
>
> db:open-id('TG-DTA-GerManC-ste
> let $title := (db:open-id('TG-DTA-GerManC-stemming-ws', $node)
> /ancestor::*:TEI[1]//*:fileDesc)[1]//*:titleStmt[1]//*:title[1]
Do you think that the following query would return the expected result?
db:open-id('TG-DTA-GerManC-stemming-ws', $node)/
ancestor::*:TEI[1]/
descendant
Hi Christian,
On 18.11.2012 at 17:07, Christian Grün wrote:
> It looks as if the query plan is still based on the nested predicates.
> Have you checked if the simplified form leads to the usage of index
> structures (provided that you have up-to-date index structures at this
> stage)?
I think, it
Hi Cerstin,
> OK, here is the query info. Most time is used for evaluation, also printing
> takes some time, but parsing and compiling looks pretty fast, I think.
It looks as if the query plan is still based on the nested predicates.
Have you checked if the simplified form leads to the usage of inde
Hi Christian,
On 15.11.2012 at 20:00, Christian Grün wrote:
for $i at $p in //entry[phraseme[text() = "Ad0194"] and selected[text() =
"yes"]]
It’s often beneficial to avoid nested predicates. Does the following
version give you better results?
//entry[phraseme/text() = "Ad0194" and selected
> for $i at $p in //entry[phraseme[text() = "Ad0194"] and selected[text() =
> "yes"]]
It’s often beneficial to avoid nested predicates. Does the following
version give you better results?
//entry[phraseme/text() = "Ad0194" and selected/text() = "yes"]
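To illustrate, a self-contained sketch that runs both forms against a small in-memory fragment. The sample entries and their id attributes are invented for the example. It only shows that the two predicates select the same entries; index rewriting applies to database nodes, so any speed-up has to be confirmed on the real database via the query info.

```xquery
let $doc :=
  <entries>
    <entry id="1"><phraseme>Ad0194</phraseme><selected>yes</selected></entry>
    <entry id="2"><phraseme>Ad0194</phraseme><selected>no</selected></entry>
    <entry id="3"><phraseme>Zz0001</phraseme><selected>yes</selected></entry>
  </entries>

(: Nested predicates, as in the original query. :)
let $nested := $doc//entry[phraseme[text() = "Ad0194"] and selected[text() = "yes"]]

(: Flattened form with plain path comparisons. :)
let $flat := $doc//entry[phraseme/text() = "Ad0194" and selected/text() = "yes"]

return (
  count($nested),              (: 1 :)
  count($flat),                (: 1 :)
  deep-equal($nested, $flat)   (: true :)
)
```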
Besides that, feel free to send us the qu
Hi,
I have a problem with slow queries. I'm not sure if it is due to the
construction of the query or if there is something else going on.
I'm querying two databases: "collect", which is opened at the beginning and
contains several thousand entries like these two:
40177618
[text() contain