Hi Rachit,

I think there are only very few options, and the best bet is going to be string manipulation. I also tried this slightly more direct approach:

    let $xml := "<p><???†?></p>"
    return xdmp:unquote($xml, (), "repair-full")

But that fails with just the same message. I'd use a regex to fix such PIs so that they have a valid name. Something like:

    fn:replace($xml, "<\?([^>]*)>", "<?bad ?$1>")

Note that the "?" after "<" has to be escaped in the pattern; unescaped, it acts as a quantifier rather than matching a literal question mark.

If you prefer to keep bad data out of your database, try to run xdmp:tidy or xdmp:unquote when ingesting data, if necessary in a pre-commit trigger. If that throws an exception, your data won't get inserted, keeping your database clean.

Kind regards,
Geert

From: <[email protected]> on behalf of Rachit Rampal <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Tuesday, January 12, 2016 at 5:43 AM
To: "[email protected]" <[email protected]>
Subject: [MarkLogic Dev General] Problem while using xdmp:tidy

Hi,

Background

I have a piece of content in my legacy database which is neither valid HTML nor valid XML. Since it would be difficult to clean up the legacy data, I want to tidy it up in MarkLogic (version 8.0-3) using xdmp:tidy. The content looks like:

[cid:[email protected]]

Please find attached the query I'm executing in the ML QConsole to tidy this up.

Problem

The response I get after applying tidy is not valid XML (verified with an XML validator). Also, when I try to insert a document with the resulting XML body via POSTMAN or RESTClient, it throws an error saying 'MALFORMED BODY | Invalid Processing Instruction names'.

Response XML:

[cid:[email protected]]

Expectations

My expectation is that MarkLogic's tidy should refuse to tidy up this type of content and throw an error, which it does not do in the current scenario. If I got the error from tidy itself, I would rather have this dirty data removed from the legacy database.

Please help me get through this problem or suggest a workaround.

Things Tried So Far

I have tried the various options listed for xdmp:tidy, but they didn't help much. I also looked into processing instructions but couldn't find a way through, as this doesn't look like a valid PI either.

Kind Regards,
Rachit Rampal
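
For reference, the two suggestions in the reply (renaming the invalid PI with a regex, then checking that the result actually parses) could be combined roughly as follows. This is only a sketch and has not been run against 8.0-3. The sample string is hypothetical, since the real content was only attached as an image. The "?" in the pattern is escaped on the assumption that a literal question mark was intended, and "bad" is just a placeholder PI name. Be aware that this pattern would also rename well-formed PIs, so it may need narrowing for real data.

```xquery
xquery version "1.0-ml";

(: Hypothetical sample: a processing instruction whose name is not a valid XML name. :)
let $xml := "<p><?†?></p>"

(: "\?" matches a literal question mark; the rest of the PI is kept as its data. :)
let $fixed := fn:replace($xml, "<\?([^>]*)>", "<?bad $1>")

return
  try {
    (: Parse the repaired string; this throws if it is still not well-formed. :)
    xdmp:unquote($fixed)
  } catch ($e) {
    (: Report the error code rather than inserting anything. :)
    $e/error:code/fn:string()
  }
```

In an ingest path, the catch branch is where a document would be rejected or logged, which matches the goal of keeping malformed content out of the database.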
