Re: [MarkLogic Dev General] Seg fault when running xdmp:tidy

2014-06-04 Thread Michael Blakeley
Yep, tidy is not very tidy. It's been a good source of crashes for as long as I've been aware of it. Anyone running MarkLogic on OSX might like to know how to simulate pstack on OSX with a simple script. This requires gdb, of course. The output can be very useful for reporting bugs, and sometim

Re: [MarkLogic Dev General] Seg fault when running xdmp:tidy

2014-06-04 Thread David Ennis
Confirmed on my version of Linux as well. A small note: With pstack installed, there is really not much more to go on: crawl: Input/output error Error tracing through process 24904 24904: /opt/MarkLogic/bin/MarkLogic So, pstack itself can't seem to trace through the MarkLogic process (bec

[MarkLogic Dev General] Seg fault when running xdmp:tidy

2014-06-04 Thread Jakob Fix
try for yourself (7.0-2.3 on Mac): let $u := " http://www.larepublica.co/ocde-recomienda-fortalecer-el-sistema-de-planeaci%C3%B3n_124391 " let $c := xdmp:http-get($u, false UTF-8 full ) let $d := xdmp:tidy($c[2], section, header, time, figure, nav, article yes yes yes )[2] return $d log s

Re: [MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread Ron Hitchens
Thanks Wayne. --- Ron Hitchens {r...@overstory.co.uk} +44 7879 358212 On Jun 4, 2014, at 11:12 PM, Wayne Feick wrote: > Fair points, Ron. We have RFE 2322 filed back in Feb 2012 to track this. I'll > add a note indicating your interest as well. > > Wayne. > > > On 06/04/2014 03:00 PM,

Re: [MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread Ron Hitchens
Unless your unique-uri() function is running in a non-update query, in which case it runs lock free at a timestamp. If you're using the pattern of main code as a query and updates delegated to invoked/eval'ed transactions, you could get bit by this. It would work fine the vast majority of

Re: [MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread Wayne Feick
Fair points, Ron. We have RFE 2322 filed back in Feb 2012 to track this. I'll add a note indicating your interest as well. Wayne. On 06/04/2014 03:00 PM, Ron Hitchens wrote: Wayne, Thanks for this. It's a useful code pattern for this sort of thing and I will probably use it for the spe

Re: [MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread Ron Hitchens
Wayne, Thanks for this. It's a useful code pattern for this sort of thing and I will probably use it for the specific requirement I have at the moment (I was planning to do something similar anyway). But this code, or any user-level code, does not fully implement the uniqueness guarant

Re: [MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread Analyze That | Johan van den Brink
Thanks David. Met vriendelijke groet, Johan van den Brink Consultant Analyze That - Analytics | Data Integration | Reporting | Process Mining Kerkewijk 8 3901 EG Veenendaal T: (06) 49 92 30 30 T: (0318) 52 55 87 M: jo...@analyzethat.nl W: www.analyzethat.nl

Re: [MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread David Ennis
Hi. I believe you can do that here: http://developer.marklogic.com/mailman/listinfo/general Kind Regards, David Ennis On 4 June 2014 23:09, Analyze That | Johan van den Brink < jo...@analyzethat.nl> wrote: > Hi guys, > > How can I unsubscribe from this mailing list? > > > Met vriendelijke gro

Re: [MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread Analyze That | Johan van den Brink
Hi guys, How can I unsubscribe from this mailing list? Met vriendelijke groet, Johan van den Brink Consultant Analyze That - Analytics | Data Integration | Reporting | Process Mining Kerkewijk 8 3901 EG Veenendaal T: (06) 49 92 30 30 T: (0318) 52 55 87 M: jo...@analyzethat.nl W: www.analyzeth

Re: [MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread Wayne Feick
The simplest is to have the document URI correspond to the element value, and if you can use a random value it's good for concurrency. If you can't do that, but you want to ensure only one document can have a particular value for an element, I think it's pretty easy using xdmp:lock-for-update(
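
A minimal sketch of the kind of locking pattern described above. This is not the code from the original message, which is cut off in this listing. The element name `id`, the lock URI prefix `/locks/id/`, the error name `DUPLICATE-ID`, and the function name `local:insert-unique` are all assumptions made for illustration. The idea is that every writer first takes a write lock on a URI derived from the value, so two transactions writing the same value serialize, and the second one can see the first one's committed document.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: insert $doc at $uri only if no other document
   already carries $value in an <id> element. :)
declare function local:insert-unique(
  $uri as xs:string,
  $value as xs:string,
  $doc as node()
) as empty-sequence()
{
  (: Write-lock a URI derived from the value. No document has to exist
     at that URI; the lock only serializes writers of the same value. :)
  let $_ := xdmp:lock-for-update("/locks/id/" || $value)
  (: Look for a different document that already holds the value. :)
  let $clash :=
    cts:search(fn:doc(),
      cts:element-value-query(xs:QName("id"), $value, "exact")
    )[xdmp:node-uri(.) ne $uri][1]
  return
    if (fn:exists($clash))
    then fn:error(xs:QName("DUPLICATE-ID"),
                  "value already in use: " || $value)
    else xdmp:document-insert($uri, $doc)
};

local:insert-unique("/doc/fluffy.xml", "fluffy",
  <doc><id>fluffy</id></doc>)
```

A sketch like this only protects writers that actually go through the helper. Any code that inserts documents directly bypasses the check.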

Re: [MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread Whitby, Rob
I thought 2 simultaneous transactions would both get read locks on the uri, then one would get a write lock and the other would fail and retry. Maybe I'm missing something though. But anyway, I agree unique indexes would be a handy feature. e.g. our docs have a DOI element which *should* be uni

Re: [MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread Ron Hitchens
John, What I'm interested in is not so much a strategy for generating unique IDs but rather imposing a constraint on the content in the database. Yes, there are ways of creating IDs that are very unlikely to clash, but that's not really the crux of the new feature I'm suggesting. I wan

Re: [MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread John Snelson
On 04/06/2014 19:31, Ron Hitchens wrote: > In my case, the naming space is actually quite small because I want the > IDs to be meaningful but unique. For example "images:cats:fluffy:XX.png", > where XX can increment or be set randomly until the ID is unique. Make XX a random number. Or two

Re: [MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread John Snelson
Maybe you could consider using sem:uuid() in MarkLogic 7? You are much better off with a statistically unique ID than actually taking the time and massive concurrency reduction to check uniqueness. John On 04/06/2014 18:01, Ron Hitchens wrote: > > I'm working on a project, one aspect of whi
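
A short sketch of the statistically unique ID approach suggested above, assuming MarkLogic 7 with the semantics library available. It uses `sem:uuid-string()`, the string-returning sibling of `sem:uuid()`; the `/doc/` URI prefix and the `id` element are hypothetical.

```xquery
xquery version "1.0-ml";

import module namespace sem = "http://marklogic.com/semantics"
  at "/MarkLogic/semantics.xqy";

(: Mint a random UUID and use it as both the document URI and the ID value.
   No uniqueness check is made; a clash is statistically very unlikely. :)
let $id := sem:uuid-string()
return
  xdmp:document-insert(
    "/doc/" || $id || ".xml",
    <doc><id>{$id}</id></doc>
  )
```

Because nothing is looked up or locked beyond the new document itself, concurrent writers do not contend with each other. The trade-off is that the IDs carry no meaning.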

Re: [MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread Ron Hitchens
Rob, I believe there is a race condition here. A document may not exist as of the timestamp when this request starts running, but some other request could create one while it's running. This request would then overwrite that document. I'm actually more concerned about element values in
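
One way to narrow the race described above, sketched under the assumption that the code runs in an update transaction: take a write lock on the candidate URI before testing whether it exists. This is a hypothetical variant of the unique-uri() function posted in this thread, renamed `local:unique-uri`, and not code from any of the messages.

```xquery
xquery version "1.0-ml";

(: Hypothetical variant of the thread's unique-uri() function. :)
declare function local:unique-uri() as xs:string
{
  let $uri := "/doc/" || xdmp:random() || ".xml"
  (: Write-lock the candidate URI first, so no other update transaction
     can create a document there before this transaction finishes. :)
  let $_ := xdmp:lock-for-update($uri)
  return
    if (fn:doc-available($uri))
    then local:unique-uri()   (: already taken: try another random URI :)
    else $uri
};

(: The insert runs in the same transaction that holds the lock. :)
xdmp:document-insert(local:unique-uri(), <doc/>)
```

The lock is only useful if the insert happens in the same transaction that acquired it. If the URI is generated in one transaction and the insert is delegated to a separately invoked or eval'ed transaction, the lock does not carry over to that second transaction.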

Re: [MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread Whitby, Rob
How about something like this? declare function unique-uri() { let $uri := "/doc/" || xdmp:random() || ".xml" return if (fn:not(fn:doc-available($uri))) then $uri else unique-uri() }; I guess because indexes are distributed across forests, ensuring uniqueness is not that easy? Rob _

[MarkLogic Dev General] New Feature Request: Unique Value Range Indexes

2014-06-04 Thread Ron Hitchens
I'm working on a project, one aspect of which requires minting unique IDs and assuring that no two documents with the same ID wind up in the database. I know how to accomplish this using locks (I'm pretty sure) but any such implementation is awkward and prone to subtle edge case errors, and

[MarkLogic Dev General] unsubscribe

2014-06-04 Thread Shaun Venus
From: himanshu kapsime Sent: ‎04/‎06/‎2014 13:52 To: MarkLogic Developer Discussion Subject: [MarkLogic Dev General] unsubscribe ___ General mailing lis

[MarkLogic Dev General] unsubscribe

2014-06-04 Thread himanshu kapsime
___ General mailing list General@developer.marklogic.com http://developer.marklogic.com/mailman/listinfo/general

[MarkLogic Dev General] unsubscribe

2014-06-04 Thread Greg Suprock

Re: [MarkLogic Dev General] S3 and xdmp:external-binary

2014-06-04 Thread David Ennis
Hi Geert. Thank you. I'll be on the AWS server this week and will have a look at getting enough information to clarify if there is reason to open a ticket. Likely there is reason to open such a ticket because the error messages in some cases refer to the bucket all in lowercase when the S3 reque

[MarkLogic Dev General] unsubscribe

2014-06-04 Thread Analyze That | Johan van den Brink
Met vriendelijke groet, Johan van den Brink Consultant Analyze That - Analytics | Data Integration | Reporting | Process Mining Kerkewijk 8 3901 EG Veenendaal T: (06) 49 92 30 30 T: (0318) 52 55 87 M: jo...@analyzethat.nl W: www.analyzethat.nl

Re: [MarkLogic Dev General] best practices for manual directory creation

2014-06-04 Thread Geert Josten
Hi Mike, The automatic dir creation will cause MarkLogic to have to check for dir existence for each doc, for every parent directory of that doc. That certainly slows down your system. Running a separate dir creation process before the ingest, with just a dir creation of each dir yet missing, w

Re: [MarkLogic Dev General] S3 and xdmp:external-binary

2014-06-04 Thread Geert Josten
Hi David, I'd expect MarkLogic and S3 to either ignore the case, or be able to handle it. I can't really comment on the third slash, but it is pretty common on the file protocol. I recommend filing a bug if you haven't done so already. That will give our engineers something to look at..