Re: [MarkLogic Dev General] MarkLogic vs CouchDB

2016-08-15 Thread Pete Aven
I would start with: What's your use case sir?  Or what is your evaluation 
criteria? A technical comparison may be meaningless depending on what you're 
trying to do.

Pete


From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of 
abhishek.srivas...@cognizant.com
Sent: Friday, August 12, 2016 6:55 PM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] MarkLogic vs CouchDB

Hi All,

Is there any comparison report between MarkLogic and CouchDB? I found something 
on the web, but it seems to be an old one.

Thanks
Abhishek
This e-mail and any files transmitted with it are for the sole use of the 
intended recipient(s) and may contain confidential and privileged information. 
If you are not the intended recipient(s), please reply to the sender and 
destroy all copies of the original message. Any unauthorized review, use, 
disclosure, dissemination, forwarding, printing or copying of this email, 
and/or any action taken in reliance on the contents of this e-mail is strictly 
prohibited and may be unlawful. Where permitted by applicable law, this e-mail 
and other e-mail communications sent to and from Cognizant e-mail addresses may 
be monitored.
___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] FW: Base content schema change

2016-02-11 Thread Pete Aven
Hi Raja,

On first glance this sounds somewhat reasonable.  The format you've proposed 
can be very effective. Your proposal seems to be in line with the envelope 
pattern, though I'm not 
sure why not all properties would be put in the doc.  If you move all the 
metadata over then you can remove the additional fragments in the system 
created for the properties docs, which can be useful.

150 properties may or may not be a lot, depending on the amount of content in 
those properties.

But the model will be informed by the content profile (shape/size of data) and 
application profile (how is data used; queries, updates, etc.)

Have you tested your proposal?  Without knowing anything about your 
applications or content, some questions I would have are:


* What's the average size of a doc after merging the content and metadata?
    o   Will more metadata or content be added over time to existing docs?
    o   If so, how large will the docs grow to?
* How is this data used?
    o   What types of queries?
* How often is document content updated?
* How often is metadata updated?
* Are there any requirements to track changes to metadata or content?
    o   If so, what does that look like?  Additional properties?  Do you need to 
        store the previous version?
* Do any of your searches/queries require the metadata and content to be in 
  the same document?
    o   If not, can the metadata and content be stored in separate documents?

Depending on answers to the above, you may want to store the metadata in a 
separate, but related document.  The doc could be related either by adding 
another attribute to both the source document and metadata (either an element 
within the doc, or using a collection), or by using Semantic triples.  
Collections or triples may make more sense if you have multiple metadata docs 
per source doc, and/or if you're relating other documents as well as metadata 
to the source document.

But if, after moving the metadata over, the docs are of reasonable size, the 
content and metadata are rarely updated, and these docs are not related to 
anything else, then I'd say give it a go and test it. :)
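
A rough sketch of how the merged (envelope) layout could be tried on a single document is shown below. It is only an illustration: the URIs and the envelope, metadata, and content element names are hypothetical, and it assumes the existing metadata lives in the document's properties fragment.

```xquery
xquery version "1.0-ml";

declare namespace prop = "http://marklogic.com/xdmp/property";

(: Hypothetical URIs; substitute real ones. :)
let $uri := "/content/doc1.xml"
let $new-uri := "/envelope/doc1.xml"

(: Custom properties only; the system-maintained last-modified stays behind. :)
let $metadata :=
  xdmp:document-properties($uri)/prop:properties/*[fn:not(self::prop:last-modified)]

return
  xdmp:document-insert(
    $new-uri,
    <envelope>
      <metadata>{ $metadata }</metadata>
      <content>{ fn:doc($uri)/* }</content>
    </envelope>)
```

Inserting under a new URI leaves the original document and its properties untouched, so document sizes and query behavior can be compared before committing to the change.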

Hope this helps,
Pete



From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of 
rajamani.marimu...@cognizant.com
Sent: Thursday, February 11, 2016 2:35 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] FW: Base content schema change

Hi Team,

Any suggestions would be really helpful for us.

By
Raja >>>

From: Marimuthu, Rajamani (Cognizant)
Sent: Tuesday, February 09, 2016 3:27 PM
To: general@developer.marklogic.com
Subject: Base content schema change

Hi Team,

  We have a plan to change our base document schema. I will explain the 
existing scenario and the expected scenario. Kindly give some feedback about 
the changes.


1.   We have the document content in one XML file and all properties [metadata] 
in the document's properties [~100 to 150 elements].

2.   We plan to merge these into a single XML file [content and relevant 
properties], so that only the default properties [like last-modified] remain 
in the document's properties.

3.   We plan to maintain the following XML structure after merging the 
content and the properties:

   
  


The above is a sample structure; the actual content element names will vary 
and carry the correct meaning.

4.   How effective is the above structure? Kindly give us your feedback so we 
can go further with this change.


Thanks and regards
Raja >>>




Re: [MarkLogic Dev General] Converting SQL Server db to Marklogic ecosystem

2015-09-03 Thread Pete Aven
Hi Kris,

When we bring over documents (rows) from a relational system, we typically put 
the rows into a collection so we know something about their source and that 
they have some set relationship we wish to maintain.

http://docs.marklogic.com/guide/search-dev/collections

As for joining, before you go there, you're likely going to denormalize the 
tables.

If you haven't seen it yet, MarkLogic has a free, on-demand course for data 
modeling:

XML and JSON Data Modeling Best Practices : http://mlu.marklogic.com/ondemand/

I also find Damon's presentation on moving from relational modeling to XML and 
document-oriented models very helpful:

http://www.marklogic.com/resources/slides-moving-from-relational-modeling-to-xml-and-marklogic-data-models/

Denormalizing can be done in MarkLogic using a simple transformation module and 
a utility such as CORB.

http://developer.marklogic.com/code/corb

After denormalizing, you may find your join key is no longer required.  As for 
joins across the new objects that you create, the keys may present themselves 
naturally in the data, or you have other options for how to perform joins 
in MarkLogic.
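
As a small-scale illustration of the denormalizing step for the Region/Sales example in the question, the sketch below copies the matching region into each sale document. The collection names, element names, and URI prefix are assumptions made for the example, not something stated in the thread; for real volumes the same transformation would more likely run per document through a utility such as CORB.

```xquery
xquery version "1.0-ml";

(: Assumed layout: <sale><country>..</country>..</sale> documents in the "sales"
   collection and <region><country>..</country>..</region> documents in "region". :)
for $sale in fn:collection("sales")/sale
let $region :=
  cts:search(
    fn:collection("region")/region,
    cts:element-value-query(xs:QName("country"), fn:string($sale/country)))[1]
return
  xdmp:document-insert(
    fn:concat("/denormalized", xdmp:node-uri($sale)),
    (: Copy the sale as-is and embed the matching region, if one was found. :)
    <sale>{ $sale/@*, $sale/node(), $region }</sale>,
    xdmp:default-permissions(),
    "sales-denormalized")
```

Once the region travels inside each sale document, a query by country no longer needs a join at all.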

Hope this helps,
Pete





From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Rajendran, Manju
Sent: Thursday, September 03, 2015 8:50 AM
To: general@developer.marklogic.com
Subject: Re: [MarkLogic Dev General] Converting SQL Server db to Marklogic 
ecosystem

Hi Kris,

I also came from the RDBMS world to the MarkLogic world. :)

You can use collections (fn:collection()) to group the documents under one 
collection, for example one called country.
I assume the XML/JSON documents will also have the country id or country 
element that you can use to link and look up values.

Thanks
Manju


From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Cocoram007
Sent: Thursday, September 03, 2015 8:27 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Converting SQL Server db to Marklogic ecosystem

Hi,

I am just trying to understand the scenario of converting an existing SQL 
Server relational Db to Marklogic ecosystem.

In SQL, we have tables and relationships between tables

In Marklogic, we handle rows as documents, and we have one document for each 
row of the database table, Is it possible to establish a relationship between a 
group of documents (say a directory) to another directory(another group of 
documents)?

Say, in detail,

We have a Region table and daily Sales transaction table linked by 
Sales.country = Region.Country

In Marklogic, we have Region Directory (Similar to SQL Region Table) and Sales 
Directory(Similar to Sales Transaction Table). How do we link these two 
directories on country?

Can someone please shed some light on this?

Thanks
Kris
This e-mail may contain confidential or privileged information. If you think 
you have received this e-mail in error, please advise the sender by reply 
e-mail and then delete this e-mail immediately. Thank you. Aetna


Re: [MarkLogic Dev General] Converting MS Office documents

2015-03-27 Thread Pete Aven
That should work.  I just tried on 8.0-1.1 on Windows and got the expected 
results.

If you're using CPF, then you want to confirm you have the following pipelines 
enabled:

Status Change Handling
Office OpenXML Extract

For Office 2007 and greater (docs ending with a .docx, .pptx, or .xlsx extension) 
the file format is zipped XML, so you can unzip the contents and work with the 
native OpenXML format directly once you've extracted the contents using the 
Office OpenXML Extract pipeline.

Once inserted, the original doc will be saved in MarkLogic as:
/myDoc/UtilizationReport_xlsx  //the original doc

Once this original doc is processed by Office OpenXML Extract, you should see the 
extracted parts in MarkLogic as well:
/myDoc/UtilizationReport_xlsx_parts   //with a bunch of .xml here in 
SpreadsheetML format

The cpf state on the .xlsx will be:  http://marklogic.com/states/extracted

If you already have those 2 pipelines enabled, you may want to disable the others 
to see if you can get the expected results, to ensure no pipelines are 
conflicting with each other in their attempt to process the document.
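
A quick way to check the outcome from Query Console is sketched below. The spreadsheet URI is the one from the question; the parts directory follows the naming shown earlier in this reply, with a trailing slash added as an assumption about how the directory is addressed.

```xquery
xquery version "1.0-ml";

declare namespace cpf = "http://marklogic.com/cpf";

(: URI of the spreadsheet as loaded in the original question. :)
let $uri := "/myDoc/UtilizationReport.xlsx"
return (
  (: Current CPF state recorded in the document's properties. :)
  fn:string(xdmp:document-properties($uri)//cpf:state),

  (: URIs of any extracted parts under the assumed parts directory. :)
  for $part in xdmp:directory("/myDoc/UtilizationReport_xlsx_parts/", "infinity")
  return xdmp:node-uri($part)
)
```

If the state comes back but no part URIs are listed, the extract pipeline most likely never ran on the document.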

Hope this helps,
Pete



From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Javier Lizarraga
Sent: Thursday, March 26, 2015 7:51 PM
To: General@developer.marklogic.com
Subject: [MarkLogic Dev General] Converting MS Office documents

Hello Developers,

I want to load an MS excel file with filename.xlsx into a MarkLogic database 
(using ML8).  I want to be able to access the contents of the MS excel document.
I enabled the triggers for the database and installed  and enabled the Content 
Processing.  I followed the ML document below:
http://docs.marklogic.com/guide/cpf/default

Loaded:
declareUpdate();
xdmp.documentLoad("C:\\Users\\jlizarraga\\Documents\\UtilizationReport.xlsx",
{
  uri : "/myDoc/UtilizationReport.xlsx",
  permissions : xdmp.defaultPermissions()
})

When I load my UtilizationReport.xlsx file I can see the associated properties 
in Query Console:
<?xml version="1.0" encoding="UTF-8"?>
<prop:properties xmlns:prop="http://marklogic.com/xdmp/property">
  <cpf:processing-status xmlns:cpf="http://marklogic.com/cpf">done</cpf:processing-status>
  <cpf:property-hash xmlns:cpf="http://marklogic.com/cpf">d41d8cd98f00b204e9800998ecf8427e</cpf:property-hash>
  <cpf:last-updated xmlns:cpf="http://marklogic.com/cpf">2015-03-26T16:24:16-07:00</cpf:last-updated>
  <cpf:state xmlns:cpf="http://marklogic.com/cpf">http://marklogic.com/states/converted</cpf:state>
  <cpf:self xmlns:cpf="http://marklogic.com/cpf">/myDoc/UtilizationReport.xlsx</cpf:self>
</prop:properties>

It appears to me that it was successful but I do not see any other associated 
documents besides the UtilizationReport.xlsx file reference.

I was expecting to see:
UtilizationReport.xlsx  (Original Document)
UtilizationReport_xlsx.xml
UtilizationReport_xlsx.xhtml
A Directory called UtilizationReport_xlsx_Parts

I don't see any errors.  Any help would be greatly appreciated.

Thanks,

Javier


Re: [MarkLogic Dev General] Defining Collections in Marklogic

2015-01-25 Thread Pete Aven
Hi Shashi,

For working with Collection metadata using the Java API, take a look at

http://docs.marklogic.com/guide/java/document-operations#id_89074

For future reference though, you can set the document collections on load into 
MarkLogic in Info Studio by clicking the “Document Settings” button under the 
“Destination Database” in the Load tab and then navigating to the “Collections” 
tab to name the collection.

But better yet, if you’re using Java, you can just use Content Pump to load 
information and set the collections on load using the -output_collections or 
-filename_as_collection arguments.

http://docs.marklogic.com/guide/ingestion/content-pump#id_87699
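
For documents that are already loaded, another option is a short XQuery run from Query Console. The sketch below is an illustration only: the directory and the collection name are hypothetical and should be replaced with the real ones.

```xquery
xquery version "1.0-ml";

(: Hypothetical directory and collection name. :)
for $doc in xdmp:directory("/my-loaded-docs/", "infinity")
return
  xdmp:document-add-collections(xdmp:node-uri($doc), "my-collection")
```

xdmp:document-add-collections keeps any collections a document already has; xdmp:document-set-collections would replace them instead.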

Hope this helps,
Pete



From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Shashidhar Rao
Sent: Sunday, January 25, 2015 4:53 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Defining Collections in Marklogic

Hi,
I have already loaded a bunch of XML files into a MarkLogic database. I used 
Information Studio to load the data.

Now I want the loaded data to belong to a certain collection.
Please help me define a collection for this data.
How can I make this data belong to a particular collection? Is there a way to 
set this in the Admin interface at localhost:8001 or using the Java API?
Regards
Shashi


Re: [MarkLogic Dev General] Marklogic times out in insertion of 20000 documents for Scenario2

2014-11-27 Thread Pete Aven
Hi Rahul,

When you say “Install CPF”, there’s potentially a lot there.

When you installed Content Processing for your database, did you enable all the 
pipelines by default?  (This is the “enable conversion” as true option.) If so, 
then if you insert the docs into the CPF domain (which is most likely defined 
as “/”, and you appear to be inserting all docs prefixed with “/xml#”), then 
all those pipelines will potentially run a condition check, and possibly a 
follow-on action, on the inserted docs. So there will be overhead.

A couple of other potential tests:


1)  Disable all the pipelines for the domain except “status change handling”

a.   Never disable “status change handling”, unless you really know what 
you’re doing in CPF.  You can probably disable this too, but if you do go back 
and enable any of the pipelines, you’ll want to re-enable this as well.

2)  Insert the docs WITHOUT the  “/” prefix in the URI. So CPF should not 
be triggered at all when the docs are inserted.

a.   CPF requires the URI to start with at least a leading “/” to trigger any 
domain you define for CPF. So inserting as xml#.xml (instead of /xml#.xml) 
won’t insert it into any CPF domain for processing. Overhead should be minimal.

Maybe try those and see what performance you get.

But I’m not sure what you’re trying to accomplish here.  You don’t enable CPF 
unless you really need it for something.  And we don’t necessarily recommend 
inserting 20,000 docs as a single transaction.  I understand these are 
synthetic docs for testing, but you’d likely use Content Pump which has flags 
that allow you to do things such as set the transaction size for insert.
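
If the test has to stay in Query Console rather than move to Content Pump, one way to avoid a single 20,000-document transaction is to commit in batches. The sketch below is an assumption-laden illustration, not a recommendation from the docs: it presumes MarkLogic 7 or later (for xdmp:invoke-function), and the batch size of 500 is arbitrary.

```xquery
xquery version "1.0-ml";

let $total := 20000
let $batch-size := 500   (: arbitrary; tune for your environment :)
for $batch in (0 to (($total - 1) idiv $batch-size))
let $start := $batch * $batch-size + 1
let $end := fn:min(($start + $batch-size - 1, $total))
return
  (: Each call runs, and commits, as its own transaction. :)
  xdmp:invoke-function(
    function() {
      for $i in ($start to $end)
      return
        xdmp:document-insert(
          fn:concat("/xml", $i, ".xml"),
          element { fn:concat("cpf_", $i) } { $i })
    },
    <options xmlns="xdmp:eval">
      <transaction-mode>update-auto-commit</transaction-mode>
    </options>)
```

Smaller transactions hold fewer locks at a time, but they do not remove the per-document CPF trigger overhead discussed above.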

Hope this is useful,
Pete

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of David Ennis
Sent: Thursday, November 27, 2014 8:42 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Marklogic times out in insertion of 20000 
documents for Scenario2

HI.

Regarding number 4 in scenario 2: It is my understanding that installing CPF, 
regardless of what pipelines are configured, causes overhead. If nothing else, 
there are about 6 triggers installed (a subset of which run on insert).

Kind Regards,
David Ennis



David Ennis
Content Engineer

HintTech http://www.hinttech.com/
Mastering the value of content
creative | technology | content

Delftechpark 37i
2628 XJ Delft
The Netherlands
T: +31 88 268 25 00
M: +31 63 091 72 80

On 27 November 2014 at 12:28, Rahul Gupta 
<rahul.gu...@nagarro.com> wrote:
Can you please let me know why Marklogic times out in mentioned Scenario2 
whereas it quickly performs Scenario1?

Scenario1:

1)  Create a new database.

2)  Insert 20000 documents in this database through QConsole using the 
following code.
for $i in (1 to 20000)
let $uri := fn:concat("/xml", $i, ".xml")
let $document := element {fn:concat("cpf_", $i)} {$i}
return
xdmp:document-insert($uri, $document, xdmp:default-permissions(), "collections")

3)  It takes 4-9 seconds on ML 7.0-4.1 on a dual-core processor with only 1 
forest attached.

Scenario2:

1)  Install CPF over this database and don’t define any action on the initial 
state. Rather, give some action on a user-defined state.

2)  Run the same code again.

3)  All the documents inserted will go to the initial state without invoking 
any action.

4)  My understanding is that installing CPF without any action being performed 
on the initial state should give us the same performance as Scenario 1, which is 
not the case.

5)  The query takes a very long time and even times out. Tested with 
12000 documents, it takes 19 minutes.


Thanks,
Rahul Gupta




Re: [MarkLogic Dev General] Is there a way to extract worksheet metadata from an Excel 97/2003?

2014-10-20 Thread Pete Aven
In case it's useful, Microsoft also provides the Microsoft Office Planning 
Manager and Office Compatibility Packs free, which will allow you to bulk 
convert older Office formats ( 2003 and earlier ) to the new OOXML formats.

https://www.microsoft.com/en-us/download/details.aspx?id=21888#filelist
https://www.microsoft.com/en-us/download/details.aspx?id=3

Pete

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Gary Russo
Sent: Monday, October 20, 2014 10:44 AM
To: 'MarkLogic Developer Discussion'
Subject: Re: [MarkLogic Dev General] Is there a way to extract worksheet 
metadata from an Excel 97/2003?

Hello Ron,

Yes, it is feasible to do the metadata extraction upstream of MarkLogic.

It complicates things a little bit but it will be ok.

Apache Tika looks like a nice solution.

My client is a Microsoft shop and they use a product called Aspose to 
convert/extract data from spreadsheets.

The majority of spreadsheet formats that I need to ingest use the older 97/2003 
format. I can use the Aspose API to convert the older format to OOXML on the fly.

It's unfortunate that the MarkLogic xdmp:document-filter() API is not able to 
extract the defined name metadata from the 97/2003 file format.

I consider it to be a bug in the MarkLogic API because other Excel Spreadsheet 
extraction APIs (e.g., Aspose, Tika, Apache POI) can extract this data from the 
older file format.

Anyway, thanks for the info.


-  Gary R



From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Ron Hitchens
Sent: Friday, October 17, 2014 11:52 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Is there a way to extract worksheet 
metadata from an Excel 97/2003?


   If it's feasible to do your metadata extraction upstream of MarkLogic (i.e., 
before insertion) you might take a look at Apache Tika.  It's designed for this 
sort of thing.

   You could also set it up as a simple web service callable from MarkLogic.  
POST the spreadsheet to it and have it return the metadata in whatever form you 
like.

---
Ron Hitchens {r...@overstory.co.uk}  +44 7879 358212

On Oct 17, 2014, at 3:35 PM, Gary Russo 
<garyru...@hotmail.com> wrote:

Hello Dennis,

Thanks for the info.

Yes, I tried xdmp:excel-convert() but this does not get the worksheet metadata 
either.

The metadata that I need to retrieve from the older Excel format is the named 
fields.

Users create them using the Excel Name Box feature as shown here: 
http://spreadsheets.about.com/od/exceltips/qt/81225namebox.htm

It looks like my only option is to use the Apache POI Java API to extract the 
named fields or use it to convert xls-to-xlsx on-the-fly: 
https://poi.apache.org/apidocs

I know there's a hidden way to use MarkLogic's underlying JVM.

It would be great if I could use it to call the Apache POI code.

But that's a question for another day.

Thanks again,

Gary Russo


Gary Russo
Enterprise NoSQL Developer
http://garyrusso.wordpress.com
http://twitter.com/garyprusso



From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of David Ennis
Sent: Thursday, October 16, 2014 5:02 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Is there a way to extract worksheet 
metadata from an Excel 97/2003?

HI.

I believe that with the conversion licence, you can do what you want with: 
xdmp:excel-convert

Barring that, you could always run openoffice as a headless server for 
conversion purposes.

Kind Regards,
David Ennis





David Ennis
Content Engineer

[HintTech] http://www.hinttech.com/
Mastering the value of content
creative | technology | content

Delftechpark 37i
2628 XJ Delft
The Netherlands
T: +31 88 268 25 00
M: +31 63 091 72 80

On 16 October 2014 20:00, Gary Russo 
<garyru...@hotmail.com> wrote:
I need to extract worksheet metadata called defined name from Excel 97/2003 
formatted spreadsheets.

The ISYS xdmp:document-filter() API is limiting because it only extracts the 
text.

It does not extract any worksheet metadata.

Does anyone know of a workaround for this?

My only thought is to upload the Excel 97/2003 xls file and then convert it 
on the server to an Excel 2010 xlsx format.

Once it's in an Excel 2010 format, I can easily extract the defined name 
metadata.

This is what it looks like in Excel 2010 files.

  definedNames

Re: [MarkLogic Dev General] Text mime mapping a file with an unusual file extension

2014-09-03 Thread Pete Aven
Either. You can set mimetype implicitly or explicitly in MarkLogic:

Implicitly:
http://docs.marklogic.com/guide/ingestion/formats#id_39990 (See Mimetypes in 
left pane on Admin UI)

Explicitly:
http://docs.marklogic.com/guide/ingestion/formats#id_38692

The pro tip suggested in the docs is to explicitly set the format.
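
As an illustration of the explicit route, the sketch below loads the file from the server's file system and forces text handling through the format option. The file path and the target URI are hypothetical.

```xquery
xquery version "1.0-ml";

(: Hypothetical file path and URI; the format option forces text handling
   regardless of the .oth extension. :)
xdmp:document-load("/space/incoming/File.oth",
  <options xmlns="xdmp:document-load">
    <uri>/docs/File.oth</uri>
    <format>text</format>
  </options>)
```

This covers the case where the load is done from XQuery; a WebDAV upload would still rely on the configured mimetypes.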

Pete


From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Tim
Sent: Wednesday, September 03, 2014 11:01 AM
To: 'MarkLogic Developer Discussion'
Subject: [MarkLogic Dev General] Text mime mapping a file with an unusual file 
extension

Hi Folks,

I am loading a document into MarkLogic that has an unusual file extension which, 
for the sake of discussion, I will call “.oth”.  If I rename the file with a .txt 
extension then I can easily tokenize it as a huge text string, but with a 
nonstandard extension it is not treated as text.  I want to associate the file 
as a text file, but what I would like to know is if I can import the file, 
e.g., File.oth, using WebDAV and then set the mime type in my XQuery, or if the 
mime type needs to be preconfigured in the database.

Suggestions?

Thank you!

Tim M.



Re: [MarkLogic Dev General] Urgent request for assistance (please)

2014-08-01 Thread Pete Aven
And to transform the doc using XQuery, take a look at the typeswitch 
expression: http://docs.marklogic.com/guide/app-dev/typeswitch#id_65827

-pete

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Joe Bryan
Sent: Friday, August 01, 2014 8:15 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Urgent request for assistance (please)

Hi Paul,

The xdmp:node-* functions only work on documents stored in the database. To 
update a file on disk, you'll need to reconstruct the entire document as you 
want, and then save it using xdmp:save.

Can I ask why you've taken this approach? I think you'll be much better served 
by querying, transforming, and updating documents that are stored in the 
database. There's a much larger API surface available to you, and you'll be 
able to leverage the universal index.
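
If the files really do need to stay on disk, the rebuild-and-save approach could look roughly like the sketch below, which uses a typeswitch to copy the target document while swapping in the ppl_name element from the source. The paths and element names are taken from the question; the local:replace-name function is a hypothetical helper written for this example, and the sketch assumes ppl_name sits under /document/meta as the question's XPath implies.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: copies a node tree, replacing every ppl_name
   element with $new. :)
declare function local:replace-name($node as node(), $new as element()) as node()*
{
  typeswitch ($node)
    case document-node() return
      document { for $c in $node/node() return local:replace-name($c, $new) }
    case element(ppl_name) return $new
    case element() return
      element { fn:node-name($node) } {
        $node/@*,
        for $c in $node/node() return local:replace-name($c, $new)
      }
    default return $node
};

let $dir := "C:/Users/Paul/Documents/MarkLogic/xml/ppl/ppl/"
let $source := xdmp:document-get(fn:concat($dir, "jdbc_ppl_3065.xml"))
let $target := xdmp:document-get(fn:concat($dir, "jdbc_ppl_3790.xml"))
let $new-name := $source/document/meta/ppl_name
return
  (: Overwrites the target file on the server's file system. :)
  xdmp:save(fn:concat($dir, "jdbc_ppl_3790.xml"),
            local:replace-name($target, $new-name))
```

With both documents stored in the database instead, a single xdmp:node-replace on the target's ppl_name element would do the same job.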

Thanks.

-jb

From: Paul Farrell <pauldfarr...@hotmail.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Friday, August 1, 2014 at 7:55 PM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Urgent request for assistance (please)

Hi,

I am desperately hoping someone out there may be able to help me out with an 
xquery app that I am building.

I have only just started with Marklogic and XQuery. It's going fairly well, 
however I am having a really tough time in modifying the content of one of my 
XML documents. I just cannot seem to get a change to an element to pick up. 
Here's my process (I have had to take things back as basic as I could just to 
try and get it working):

1. In Query console I have one tab open which queries for the contents of one 
XML doc

xquery version "1.0-ml";
declare namespace html = "http://www.w3.org/1999/xhtml";
xdmp:document-get("C:/Users/Paul/Documents/MarkLogic/xml/ppl/ppl/jdbc_ppl_3790.xml")

2. This brings back the document as below

<?xml version="1.0" encoding="UTF-8"?>
<document>
<meta>
<rm_mimetype>
</rm_mimetype>
<rm_hasattachments>

false
</rm_hasattachments>
<rm_attachmentcount>
...
3790
</ppl_id>
<ppl_name>

Victoria Wilson
</ppl_name>


 3. I now want to update the ppl_name element using XQuery but it's just not 
happening. Here's the XQuery:
xquery version "1.0-ml";
declare namespace html = "http://www.w3.org/1999/xhtml";


let $docxml :=
xdmp:document-get("C:/Users/Paul/Documents/MarkLogic/xml/ppl/ppl/jdbc_ppl_3065.xml")/document/meta/ppl_name
return
  for $node in $docxml/*
  let $target := 
    xdmp:document-get("C:/Users/Paul/Documents/MarkLogic/xml/ppl/ppl/jdbc_ppl_3790.xml")/document/meta/*[fn:name() = fn:name($node)]
  return
  xdmp:node-replace($target, $node)

--- I am basically looking to replace the ppl_name element in the target 
(3790) with the ppl_name element from the source (3065).

4. I run the XQuery - it completes without error (making me think it has 
worked) - the return value reads "your query returned an empty sequence"

5. I then go back to the same tab as I used in step 1 and re-run the XQuery 
used in step 1. The doc (3790) comes back but it STILL has Victoria Wilson as 
the ppl_name


Can anyone please help? Perhaps the change needs committing? I just don't know.

Thanks for reading
Paul



Re: [MarkLogic Dev General] csv load

2014-02-17 Thread Pete Aven
See Importing Content with MarkLogic Content Pump (mlcp):

http://docs.marklogic.com/guide/ingestion/content-pump#id_70366

Pete

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Erik van der 
Hoeven
Sent: Monday, February 17, 2014 11:00 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] csv load

Gentlemen,

Does anybody know a way to load a CSV file into a MarkLogic database?



Met vriendelijke groeten/With kind regards,

Erik van der Hoeven
Consultant Business Intelligence

DIKW CONSULTING BV
Einsteinbaan 12
3439 NJ Nieuwegein
M: 06-43029943
E: erik.van.der.hoe...@dikw.com


Re: [MarkLogic Dev General] can we save the output of the Xquery in any format like text, excel in the local drive of my machine

2014-01-29 Thread Pete Aven
Yes, it's possible. Look at xdmp:save() and its examples: 
http://docs.marklogic.com/xdmp:save
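
A minimal sketch of the idea is shown below. The query and the output path are hypothetical; note that xdmp:save writes to the file system of the MarkLogic host that runs the query, which is the local drive only when MarkLogic is installed on the same machine.

```xquery
xquery version "1.0-ml";

(: Hypothetical query: the URIs of up to ten documents, one per line. :)
let $lines :=
  for $doc in fn:subsequence(fn:doc(), 1, 10)
  return xdmp:node-uri($doc)
return
  (: xdmp:save needs a node, so the text is wrapped in a text node. :)
  xdmp:save("C:/temp/query-output.txt",
            text { fn:string-join($lines, "&#10;") })
```

A comma-separated text file written this way can then be opened in Excel.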

Pete

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of radh...@dell.com
Sent: Wednesday, January 29, 2014 8:34 PM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] can we save the output of the Xquery in any 
format like text, excel in the local drive of my machine


Dell - Internal Use - Confidential

Hi Team,

Looking for an option to save the output of the XQuery from the query console 
in any other formats like text, excel in the local drive of my machine. Please 
suggest if this is possible?

Thank You

Regards,
Radha


Re: [MarkLogic Dev General] Pipeline for all type doc in XML

2013-08-07 Thread Pete Aven
Default Conversion does not operate on Office 2007/2010/2013 documents (docx, 
xlsx, pptx, etc.).  Its usage is for documents of type Office 2003 and earlier.

Your choices are to enable the Office Open XML Extract pipeline, which will 
unzip these documents and ingest their related XML parts as is. Or enable the 
Document Filtering (XHTML) pipeline, which will extract the text from these 
binaries into an XHTML format.
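
Both routes can also be explored by hand from Query Console before enabling a pipeline. The sketch below assumes a .docx has already been stored as a binary document at a hypothetical URI, and that the built-in functions used are available in the release being run.

```xquery
xquery version "1.0-ml";

(: Hypothetical URI of a .docx already stored as a binary document. :)
let $docx := fn:doc("/docs/report.docx")
return (
  (: Lists the XML parts inside the zip container. :)
  xdmp:zip-manifest($docx),

  (: Filtered XHTML: extracted text plus document metadata. :)
  xdmp:document-filter($docx)
)
```

Comparing the two results on a few representative documents is a cheap way to decide which pipeline fits the application.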

Hope this helps,
Pete

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Singh, Gurbeer
Sent: Wednesday, August 07, 2013 10:46 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Pipeline for all type doc in XML

Found something called Default Conversion; we are using ML 5.

I created a pipeline with these attached (HTML Conversion, MS Office Conversion, PDF 
Conversion, Status Change Handling).

Observation
PDF -- XHTML
Doc -- XHTML
Docx -- not converting

I want XML for all types of doc. Is this not doable in ML 5?

~Gurbeer
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Singh, Gurbeer 
(CPT)
Sent: Wednesday, August 07, 2013 12:07 PM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Pipeline for all type doc in XML


We have been using an ML pipeline for a long time for XHTML conversion of 
doc/docx/pdf files.
For docx we are using XSLT conversion, and most of the time this fails. Sometimes 
PDF and normal doc files also fail, if the doc contains tracked changes or if 
some weird properties are enabled.

I want to get rid of this ingestion issue. I was wondering if we can write 
one new pipeline which will convert all types of doc to XML.

Is there any simple way to enable this functionality?



~Gurbeer



NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions or 
views contained herein are not intended to be, and do not constitute, advice 
within the meaning of Section 975 of the Dodd-Frank Wall Street Reform and 
Consumer Protection Act. If you have received this communication in error, 
please destroy all electronic and paper copies and notify the sender 
immediately. Mistransmission is not intended to waive confidentiality or 
privilege. Morgan Stanley reserves the right, to the extent permitted under 
applicable law, to monitor electronic communications. This message is subject 
to terms available at the following link: 
http://www.morganstanley.com/disclaimers If you cannot access these links, 
please notify us by reply message and we will send the contents to you. By 
messaging with Morgan Stanley you consent to the foregoing.





Re: [MarkLogic Dev General] Search string and Search Query

2013-02-07 Thread Pete Aven
> And Query console does not support execution of REST API calls

Actually, you CAN call the REST API through Query Console.  POST example 
follows:

xquery version "1.0-ml";
declare namespace html = "http://www.w3.org/1999/xhtml";

(: Payload to send in the POST body :)
let $doc := document {
  <payload>
    <my-payload/>
  </payload>
}
return
  xdmp:http-post(
    "http://localhost:8003/v1/resources/myresource?rs:searchText=MarkLogic",
    <options xmlns="xdmp:http">
      <authentication>
        <username>user</username>
        <password>password</password>
      </authentication>
      <headers>
        <content-type>application/xml</content-type>
      </headers>
    </options>,
    (: serialize the payload so it is sent as the request body :)
    text { xdmp:quote($doc) }
  )
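
A plain string search can be replayed the same way with a GET. The following 
is only a sketch under assumptions: it supposes a REST API instance on port 
8003 exposing /v1/search, digest authentication, placeholder credentials, and a 
made-up search string (it is not a decoding of the App Builder URL in the 
original question).

```xquery
xquery version "1.0-ml";

(: Assumed: REST API instance on port 8003; placeholder user/password. :)
let $q := "services Amex"  (: hypothetical search text :)
let $response :=
  xdmp:http-get(
    fn:concat("http://localhost:8003/v1/search?q=", xdmp:url-encode($q)),
    <options xmlns="xdmp:http">
      <authentication method="digest">
        <username>user</username>
        <password>password</password>
      </authentication>
    </options>
  )
(: Item 1 is the response metadata (status, headers); item 2 is the body. :)
return $response[2]
```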

Hope this helps,
Pete

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Ganesh 
Vaideeswaran
Sent: Thursday, February 07, 2013 11:48 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Search string and Search Query

Sini,

You cannot replay the App Builder URL in QC. We send an HTTP POST with a 
structured query (formed by JS) as the payload in the POST body, so you cannot 
simply take the App Builder URL and replay it via a tool such as cURL.

And Query Console does not support execution of REST API calls. However, if you 
know what your query is (structured or unstructured), you can use cURL to test 
the query via the REST APIs.

Ganesh Vaideeswaran
Director of Engineering, Application Services
MarkLogic Corporation
ganesh.vaideeswa...@markogic.com
Phone: +1 650 655 2398
www.marklogic.com

This e-mail and any accompanying attachments are confidential. The information 
is intended solely for the use of the individual to whom it is addressed. Any 
review, disclosure, copying, distribution, or use of this e-mail communication 
by others is strictly prohibited. If you are not the intended recipient, please 
notify us immediately by returning this message to the sender and delete all 
copies. Thank you for your cooperation.

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of sini narayanan
Sent: Thursday, February 07, 2013 6:42 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Search string and Search Query

Hi All,

I have the App Builder URL query string as
http://localhost:8017/?q=f=services*_*Amex

How do I replay this query to produce the same results from Query Console?

Please advise.

Thanks,
Sini


[MarkLogic Dev General] Ingest SAS?

2013-01-24 Thread Pete Aven
Do we have any unofficial connector that can ingest SAS files into MarkLogic?

At a client site and would like to know ASAP please.

Thanks,
Pete

Sent from my iPhone


Re: [MarkLogic Dev General] Ingest SAS?

2013-01-24 Thread Pete Aven
Want to ingest SAS database, not files.

Thanks,
Pete

-Original Message-
From: Pete Aven 
Sent: Thursday, January 24, 2013 10:49 AM
To: MarkLogic Developer Discussion
Subject: Ingest SAS?

Do we have any unofficial connector that can ingest SAS into MarkLogic?

At a client site and would like to know ASAP please.

Thanks,
Pete

Sent from my iPhone


Re: [MarkLogic Dev General] Ingest SAS?

2013-01-24 Thread Pete Aven
Hi All,

Thanks for the replies all.  To close the loop, here are some ways to export 
data from SAS to a CSV or XML format:

http://support.sas.com/documentation/cdl/en/acpcref/63184/HTML/default/viewer.htm#a003103525.htm
http://www2.sas.com/proceedings/sugi29/119-29.pdf
http://blogs.sas.com/content/sasdummy/2012/04/12/build-your-own-sas-data-set-viewer-using-powershell/

There are other ways to connect to SAS as well.  There was a misunderstanding 
on my part, and so I thought some special connector was required.

Hope this helps,
Pete



-Original Message-
From: Pete Aven 
Sent: Thursday, January 24, 2013 11:20 AM
To: MarkLogic Developer Discussion
Cc: SE
Subject: RE: Ingest SAS?
Importance: High

Want to ingest SAS database, not files.

Thanks,
Pete




Re: [MarkLogic Dev General] Semantic Analysis Using XQuery

2012-11-21 Thread Pete Aven
If you have your own dictionary/list of terms, you could use that with 
cts:highlight()

http://docs.marklogic.com/cts:highlight
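
As a rough illustration of the dictionary approach (a minimal sketch; the term 
list, the sample paragraph and the entity element name are all made up for the 
example), cts:highlight() can wrap every match from your own list of terms:

```xquery
xquery version "1.0-ml";

(: Hypothetical dictionary of terms to tag. :)
let $terms := ("Saint Paul", "Paul Street")

(: Hypothetical content; in practice this would be a node from the database. :)
let $para := <p>The office moved from Paul Street to Saint Paul last year.</p>

(: $cts:text is bound to each matched run of text inside the third argument. :)
return
  cts:highlight($para, cts:word-query($terms), <entity>{$cts:text}</entity>)
```

This only tags literal matches. Telling "St. Paul" the saint from "Paul st." 
the street still needs your own rules or an enrichment service on top.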

Also, for a free third-party enrichment tool, I like OpenCalais.

http://www.opencalais.com/documentation/calais-web-service-api/api-metadata/entity-index-and-definitions

It identifies Entities and, of more interest, Facts and Events.  The service 
has gotten pretty powerful over time, is free (usage limits: 50k calls/day, 
4 reqs/sec), and is accessible through a REST API (xdmp:http-get()).

Bonus: MarkLogic Server ships with a sample Calais pipeline.  The code 
associated with the pipeline can be found at:

\MarkLogic\Modules\MarkLogic\samples\calais-enrich

This code will give you a jump start with what you need to start calling the 
service.

There's a ton of info in the Calais response and our sample only identifies 
entities so hack accordingly.

The power of our sample is that it will enrich entities inline (awesome), but 
to do this the code calls the service more than one time per document/node so 
be aware:  with the sample code, 50k Calais API calls != 50k documents 
processed.

Hope this helps,
Pete


From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Abhishek53 S
Sent: Wednesday, November 21, 2012 10:42 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Semantic Analysis Using XQuery

Hi All,

As per my understanding, semantic analysis of content is only possible using a 
third-party enrichment engine like TEMIS LUXID (which already has a built-in 
pipeline in MarkLogic).

Can we build some XQuery-based capability (any existing API, research paper, or 
algorithm) to provide the same functionality?

eg.

St. Paul means Saint Paul

Paul st. means Paul street


Any suggestion will be highly appreciated!!!
Thanks
Abhishek Srivastav
Tata Consultancy Services
Cell:- +91-9883389968
Mailto: abhishek5...@tcs.com
Website: http://www.tcs.com

Experience certainty. IT Services
Business Solutions
Outsourcing



=-=-=
Notice: The information contained in this e-mail
message and/or attachments to it may contain
confidential or privileged information. If you are
not the intended recipient, any dissemination, use,
review, distribution, printing or copying of the
information contained in this e-mail message
and/or attachments to it are strictly prohibited. If
you have received this communication in error,
please notify us by reply e-mail or telephone and
immediately and permanently delete the message
and any attachments. Thank you


Re: [MarkLogic Dev General] Excel sheets- compatibility issue 2007 Vs 2010

2012-09-28 Thread Pete Aven
For spreadsheet-ml-support.xqy you'll need to upgrade to version 2.0-1, 
available on the developer site.

The latest release includes support for Office 2010 in addition to 2007.

Pete

Sent from my iPhone

On Sep 28, 2012, at 9:30 AM, 
sunil.chenga...@cognizant.com wrote:

Hi All,

We are generating Excel sheets using the Excel libraries provided by MarkLogic 
("spreadsheet-ml-support.xqy" (version 1.0.2) and "excel-lib.xqy"). These 
worked perfectly with Excel 2007, but when the sheets are opened with 
Excel 2010 we see the following inconsistent behaviors.

1. The sheet opens with the message "Excel found unreadable content in 
file-name. Do you want to recover contents of this workbook?" Upon clicking 
Yes, it recovers the content and displays it cleanly, with a second 
informational dialog saying "Excel completed file level validation and repair. 
Some parts of this workbook may have been repaired or discarded."

2. Sometimes it says that the workbook cannot be opened because it is 
corrupted.

I have gone through different forums on 2007 vs 2010 content compatibility, 
but none of the suggestions worked. Can someone let me know how to fix this 
if you have encountered it before?

Thanks
Sunil C


Re: [MarkLogic Dev General] Word Document Processing

2012-08-09 Thread Pete Aven
Hi Tim,

Wrt 1:

For loading you have a few options:

You can use xdmp:document-load(), or set up a WebDAV server in MarkLogic, 
configure a client, and just save your docs through WebDAV, or use Information 
Studio to load the .docx from the filesystem.

For Office 2007/2010, ensure the Office OpenXML Extract pipeline is enabled in 
MarkLogic.  This will unzip the associated parts for each Office doc and place 
them in a sibling folder to the source doc, similar to conversion.
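
For the xdmp:document-load() route, a minimal sketch looks like the following. 
The filesystem path and the target URI are placeholders chosen for 
illustration, and the path is read on the MarkLogic host, not on the client 
machine.

```xquery
xquery version "1.0-ml";

(: Hypothetical source path and database URI; adjust both to your setup. :)
xdmp:document-load("C:\Users\me\Desktop\File1.docx",
  <options xmlns="xdmp:document-load">
    <uri>/File1.docx</uri>
    <!-- store the .docx as-is; the extract pipeline works from this binary -->
    <format>binary</format>
  </options>)
```

Loading it at /File1.docx keeps the URI in line with the download example, 
which fetches the document back with fn:doc.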

To download you can do a few things as well.  For the source .docx requested 
through a browser:

xquery version "1.0-ml";
declare namespace html = "http://www.w3.org/1999/xhtml";
let $filename := "File1.docx"
(: doubled quotes ("") are the XQuery escape for a literal quote character :)
let $disposition := fn:concat("attachment; filename=""", $filename, """")
let $x := xdmp:add-response-header("Content-Disposition", $disposition)
let $x :=
  xdmp:set-response-content-type("application/vnd.openxmlformats-officedocument.wordprocessingml.document")
return fn:doc(fn:concat("/", $filename))

Or zip up the extracted parts on demand and save to the filesystem:

xquery version "1.0-ml";

(: cts:uris requires the URI lexicon to be enabled on the database :)
let $directory := "/MySpreadsheet1_xlsx_parts/"
let $uris := cts:uris("", "document", cts:directory-query($directory, "infinity"))
let $parts := for $i in $uris let $x := fn:doc($i) return $x

(: one <part> per document, named by its path relative to the parts folder :)
let $manifest :=
  <parts xmlns="xdmp:zip">
  {
    for $i in $uris
    let $dir := fn:substring-after($i, $directory)
    let $part := <part>{$dir}</part>
    return $part
  }
  </parts>
let $xlsx := xdmp:zip-create($manifest, $parts)
return xdmp:save("C:\Users\me\Desktop\ExcelChartSample.xlsx", $xlsx)

Or you can do some combination of the above, or just drag the source out of 
your WebDAV client, or...

Wrt 2:

Office 2003 and earlier Office docs are not natively XML.  For these you'll 
need to enable the default conversion option.

http://docs.marklogic.com/5.0doc/docapp.xqy#display.xqy?fname=http://pubs/5.0doc/xml/cpf/default.xmlquery=default+conversion

You can return these similarly to 1 above (a different response content type 
may be required, I forget at the moment), or with xdmp:save(), or from WebDAV, 
but there are no extracted parts to zip up, as the formats generated are not 
native XML formats for Office.

Wrt 3:

Default conversion will convert PDF and Office 2003 and earlier docs to XHTML 
and DocBook Lite.  You could then write your own transform to an Office 2007 
format.  That's where the Office Toolkit for Word may be useful.

But note, the default conversion option does not work for Office 2007/2010.  
Those formats are worked with in their native XML formats. There's currently no 
conversion option to generate XHTML or DocBook for either of these two formats.

Wrt 4:

Yes. 
http://docs.marklogic.com/4.2doc/docapp.xqy#display.xqy?fname=http://pubs/4.2doc/xml/cpf/default.xmlquery=default+conversion

Hope this helps,
Pete


From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Tim Meagher
Sent: Thursday, August 09, 2012 6:01 AM
To: 'MarkLogic Developer Discussion'
Subject: [MarkLogic Dev General] Word Document Processing

Hi Folks,

I'm new to the idea of storing, converting, and extracting Microsoft Word 
documents in and from MarkLogic and I have a couple of questions:


1.   How does one go about storing a Microsoft Word 2007/2010 docx document 
in MarkLogic and then downloading it?  It seems to me that this is pretty 
straight-forward, but I'm wondering if there are any catches.




2.   How do I do the same for Microsoft Word 97-2003 doc documents?



3.   I have reviewed the marklogic-document-support PDF for ML 5 which 
includes information about the Conversion option.  Do I understand correctly 
that with the Conversion option I should be able to load any Mac or Microsoft 
Word document into MarkLogic, convert it into a common XHTML format which can 
be parsed (and edited), and further convert it into a desired version (e.g., 
Microsoft Word 2007 docx) for download?



4.   Is the Conversion option also available for ML 4.2 and if so, where 
would I get the marklogic-document-support PDF for that?


Thanks for the help!

Tim Meagher



Re: [MarkLogic Dev General] Word Document Processing

2012-08-09 Thread Pete Aven
Hi Tim,

Yes, for 4.2 and 5.0, Default Conversion is not actually enabled by default, 
and requires additional licensing.

If you wish to try Conversion out for evaluation purposes, you can use the 
Express license.  I believe it's enabled there, but if not, you can contact 
MarkLogic at m...@marklogic.com and we can possibly 
enable it for you for evaluation to see if it meets your requirements.

The Content Processing Framework IS available, so the pipeline I mentioned for 
Office 2007/2010 will still work without the additional conversion requirement. 
 But for Office 2003 and earlier as well as PDFs, you require the additional 
licensing for default conversion.

Hope this helps,
Pete

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Tim Meagher
Sent: Thursday, August 09, 2012 10:52 AM
To: 'MarkLogic Developer Discussion'
Subject: Re: [MarkLogic Dev General] Word Document Processing

Hi Pete,

Thanks for the response - it has been very helpful.

Regarding enabling the default conversion option, does that require a 
separate license for 4.2 and 5.0+?

Tim


Re: [MarkLogic Dev General] MS Word Doc Creation, Updates, and Transforms

2012-07-26 Thread Pete Aven
Hi Tim,

Assuming you're working with Office 2007/2010, then the XQuery API that comes 
with the Word toolkit may prove useful to you.

http://developer.marklogic.com/code/marklogic-toolkit-for-word

The toolkit comes with a guide as well as xqdocs for the API.

Also, there are a series of blog posts, all referenced at the bottom of this 
one, for working with Word/Office and MarkLogic:

http://developer.marklogic.com/blog/smallchanges/2009-01-22

If you're working with Pre-2007 Word documents, then you'll want to look into 
MarkLogic's document conversion capabilities so you can generate XML from those 
binaries:

http://docs.marklogic.com/5.0doc/docapp.xqy#display.xqy?fname=http://pubs/5.0doc/xml/cpf/default.xml

Hope this helps,
Pete


-Original Message-
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Tim Meagher
Sent: Thursday, July 26, 2012 1:51 PM
To: 'MarkLogic Developer Discussion'
Subject: [MarkLogic Dev General] MS Word Doc Creation, Updates, and Transforms

Hi Folks,

I'm interested in developing a web-based UI for creating, updating, and 
accessing content from MS Word documents via MarkLogic.  I'm new to working 
with Word docs in ML.  Can you all point me to docs, toolkits, examples, etc?

Thank you!

Tim Meagher




Re: [MarkLogic Dev General] Word add-in and Word 2010

2012-02-24 Thread Pete Aven
Hi Jakob,

I received the .zip.  But can you please send me the source .docx this test.xml 
is generated from?  Something else is going on here so I'd like to test to try 
and recreate your issue.

Also, I'm assuming this is the docx that you've saved to MarkLogic and that you 
are trying to use as a search hit for insert into another document (for testing 
purposes)?

This test.xml document opens in Word in compatibility mode in 2010 (which 
signifies a 2007 doc.)  If you roundtrip a 2010 doc, this shouldn't be the case 
if you're using the latest .xqy; at least in my testing.  Also, simple docs for 
me don't generate footnotes.  The code in word-processing-ml-support.xqy 
accounts for the footnotes.xml part in OPC generation, but, maybe you've 
discovered a bug.

> Inside the /word/document.xml part, contents is simplified, for example, an 
> element for signalling a spelling error has been removed, but otherwise it 
> looks very much the same

There's the XML Office will consume, and there's the XML it will produce.  We 
aim to keep it simple when working with the formats and provide Office the 
minimum XML for ingest to still get the desired results for the author in the 
active document as well as for the next time the document is saved in Office.

> The function ooxml:get-directory-package in the latest 
> word-processing-ml-support.xqy seems to take the different components in the 
> order as returned by cts:directory-query which makes me think that order is 
> not important.

Order does not matter.

> * I did not have the WordprocessingML Process pipeline activated.
> However, once activated the insertion still didn't succeed. (The description 
> of this pipeline indicates that it's about merging similar runs. I did notice 
> when comparing the XML, that w:r elements were merged, so I'd guess that 
> works).

In case you're interested, the pipeline solves this problem: 
http://community.marklogic.com/blog/smallchanges/2007-12-18

It really shouldn't matter if it's activated, but it was a guess as to what 
XML might be getting tripped up during OPC generation, without knowing what 
your docs looked like.

Thanks again,
Pete



-Original Message-
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Jakob Fix
Sent: Friday, February 24, 2012 5:58 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Word add-in and Word 2010

Pete,

thanks for your replies, I very much appreciate them.

I saved the package XML that is about to be inserted in the current Word 
document via the Developer Tools in IE9 and tried to open it in Word (before 
doing that I added the processing instruction so that Windows launches Word 
instead of my XML editor).  This failed and Word tells me why:

---
The file test.xml cannot be opened because there are problems with the contents.

Details

The XML data is invalid according to the schema.

Location: Part: /word/footnotes.xml, Line: 3, Column: 191
---

It then goes on to suggest to attempt to recover its contents and succeeds.

So it quite clearly says the document is not valid according to its schema. I 
compared a basic OPC file created by Word with the one generated by MarkLogic 
(although one cannot expect them to be same as the generated one should only 
contain the part of a document in the database that has a search hit), and the 
main differences seem to be the order of the pkg:parts. Inside the 
/word/document.xml part, contents is simplified, for example, an element for 
signalling a spelling error has been removed, but otherwise it looks very much 
the same. The function ooxml:get-directory-package in the latest 
word-processing-ml-support.xqy seems to take the different components in the 
order as returned by cts:directory-query which makes me think that order is not 
important. But I don't have a schema handy to validate it.


Regarding the checks:
* documents are saved OK via WebDAV. I can open them directly from Word, and as 
you mentioned hits are found. The extraction pipelines are also executed as the 
_parts directory is created.
* I'm using MarkLogic 5.0-2
* I had installed the latest version of the word-processing-ml-support.xqy in 
Modules/MarkLogic/openxml.
* I did not have the WordprocessingML Process pipeline activated.
However, once activated the insertion still didn't succeed. (The description of 
this pipeline indicates that it's about merging similar runs. I did notice when 
comparing the XML, that w:r elements were merged, so I'd guess that works).

So, in summary, the package XML retrieved from MarkLogic contains the different 
parts in a different order than how Word creates them.
Otherwise I cannot see the differences. For information, I added the XML as a 
zip file to this mail. If it doesn't make it through to the list, I'll send it 
to you off-list.

cheers,
Jakob.



On Thu, Feb 23, 2012 at 19:24, Pete Aven pete.a...@marklogic.com wrote:
 Jakob!

These documents

Re: [MarkLogic Dev General] Word add-in and Word 2010

2012-02-24 Thread Pete Aven
Hi Jakob,

I just sent you 2 files.

From the Authoring App : Author/search/insert.xqy

And from the TK :  
MarkLogic/Modules/MarkLogic/openxml/word-processing-ml-support.xqy

The issue is this: Office 2010 requires namespace declarations for the root 
element of certain documents within the .docx (or OPC) package, even though 
those namespaces might not be used by any elements of the document.

Your test2.docx includes footer#.xml, footnotes.xml, and endnotes.xml parts.  
Those parts were all accounted for in OPC generation, but not the programmatic 
addition of their namespaces that 2010 requires.  We were already doing this 
for some files, but not these 3, so I added it.

Just an observation: Your test2.docx is actually a 2007 document, so I'm 
guessing you're working with an existing template. With the Addin and XQuery 
API, 2007 content can insert into 2010 and vice versa (to some extent on the 
latter, the XQuery API doesn't account for new 2010 features, but most 
organizations aren't going back and forth between flavors, ymmv)

With the updates provided, you should be in business!  I was able to generate 
OPC from the extracted test2.docx, roundtrip components inserting from 
searches, and even open the docs generated for 2010 in 2007.   But if you run 
into any issues, let me know.

> And looking at it with some more care, I notice that it's not actually an 
> XML DOM object as I thought it would be, but a string object with a trim 
> function added (see screenshot). This may be intentional, but just in case 
> ...

XML isn't a first class data type to Microsoft. When we feed Word XML , the 
Word object model actually requires the XML as a string. So, you'll find 
xdmp:quote/xdmp:unquote in the TK API and applications.
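
To make the string round trip concrete, here is a minimal sketch (the sample 
paragraph is made up; this is not code lifted from the toolkit): xdmp:quote 
serializes a node to a string for handing to Word, and xdmp:unquote parses a 
string coming back into XML again.

```xquery
xquery version "1.0-ml";
declare namespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

(: Hypothetical paragraph standing in for content fetched from the database. :)
let $para := <w:p><w:r><w:t>Hello from MarkLogic</w:t></w:r></w:p>

(: Serialize to a string, the form the Word object model expects. :)
let $as-string := xdmp:quote($para)

(: Parse a string back into XML; xdmp:unquote returns document node(s). :)
let $as-xml := xdmp:unquote($as-string)/w:p

return ($as-string, $as-xml)
```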

> PS: Maybe at the end of it all you tell me why the add-in is called Oslo 
> Information Panel. :)

Oslo's my dog. You'll only ever see that when you run the app in IE. Within the 
Addin it's not visible.  But you can always change/remove it.  Oslo makes his 
way into all my code at some point. :)

Have a good weekend,
Pete



-Original Message-
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Jakob Fix
Sent: Friday, February 24, 2012 1:51 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Word add-in and Word 2010

Pete,

I tried to recreate the test scenario like so:

* I created a test2.docx (Blank document template), with a section
content control, a requirement content control and a disclaimer boilerplate. 
Please find it attached.
* I then saved this document (no fancy save as, just save) on my local disk 
under the name test2.docx.
* I then copied it to the WebDAV folder where it got promptly unzipped and 
indexed.
* I then created a new document from scratch (Blank document template).
* In the MarkLogic pane I searched for test (because this word appears on 
both content controls), and got two results.
* I clicked on the Insert button next to a search result and encountered the 
aforementioned error about not being able to insert at this point.

I also attach the extracted file as it appears in MarkLogic, i.e. the _parts 
directory structure, for comparison, to be sure the pipelines did what we asked 
them to.

Mmmh ... I finally also attach what appears to be the xml
property of the pkgxml Javascript object. And looking at it with some more 
care, I notice that it's not actually an XML DOM object as I thought it would 
be, but a string object with a trim function added (see screenshot). This may 
be intentional, but just in case ...


cheers,
Jakob.

PS: Maybe at the end of it all you tell me why the add-in is called Oslo 
Information Panel. :)



Re: [MarkLogic Dev General] Word add-in and Word 2010

2012-02-23 Thread Pete Aven
Hi Jakob,

Are you trying to insert from the boilerplate tab, from a search hit, or both?

To test boilerplate: save a document as XML from Word. (just as XML, not 2003 
XML), save this to the database, and reference it in the config file found at 
Author/config/boilerplate.xml.

Documents saved as XML are saved in what Microsoft calls OPC format. See 
http://community.marklogic.com/blog/smallchanges/2009-01-08 for more details.

Then restart Word, place your cursor somewhere in the document, go to the 
boilerplate tab in the application, and click the button for the boilerplate 
you just added.

You'll see that the code for boilerplate insert fetches the document from the 
Server and passes it to insertWordOpenXML() which inserts it at the current 
cursor location.  If this works, we're on the right track.

The insert function behind the button on a search hit takes a component found 
in a search (a component being anything previously enriched from the enrich tab 
in the Authoring application and saved to MarkLogic) and uses the XQuery API 
to format it as OPC before inserting it into the doc using the 
insertWordOpenXML() function.
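
The two insert paths described above share one shape: get an OPC string from the Server, then hand it to insertWordOpenXML(). Below is a minimal sketch of that shape with a guard in front of the insert. It is not the Addin's actual code. looksLikeFlatOpc, insertFromServer and the fetchText callback are hypothetical names introduced for illustration; only insertWordOpenXML(opc_xml) is taken from the thread.

```javascript
// Hypothetical helper (not part of the Addin): checks that a value looks
// like a flat OPC package string before it is handed to Word.
function looksLikeFlatOpc(xml) {
  return typeof xml === "string" &&
    xml.indexOf("<pkg:package") !== -1 &&
    xml.indexOf("http://schemas.microsoft.com/office/2006/xmlPackage") !== -1;
}

// `mla` is any object exposing insertWordOpenXML(opcXml), such as the
// Addin's MLA object. `fetchText` is a caller-supplied function returning
// the document text for a URI (an assumption made for this sketch).
function insertFromServer(mla, fetchText, uri) {
  var xml = fetchText(uri);
  if (!looksLikeFlatOpc(xml)) {
    // Refuse early instead of letting Word reject the markup.
    return { inserted: false, reason: "not a flat OPC package: " + uri };
  }
  mla.insertWordOpenXML(xml);
  return { inserted: true, reason: "" };
}
```

A check like this will not catch every malformed package, but it separates "the Server returned something that is not OPC at all" (a binary .docx, a bare document.xml) from "Word disliked otherwise plausible OPC".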

Are you starting with existing docs?  Or docs from SharePoint?  These may have 
XML elements we haven't seen yet that aren't accounted for in the XQuery API 
and so may cause an issue. You may want to start by Authoring new docs to test 
the functionality, then hammer it with your existing docs to break it. :)

Hope this helps,
Pete



-Original Message-
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Jakob Fix
Sent: Thursday, February 23, 2012 8:39 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Word add-in and Word 2010

After having installed the Author sample application, which is really rather 
cool, the insertion into the current document still doesn't work.  While trying 
to understand, I noticed that the insert functionality expects the URI lexicon 
to be enabled, which wasn't mentioned in the documentation, but enabling that 
got me one step further.  Now it seems that the items found cannot be inserted 
anywhere in the current document. By the way, I stuck with the defaults for 
now (i.e. Policies, Sections, Recommendations). This is the error message: 
"ERROR error: XML markup cannot be inserted in the specified location.", which 
I was able to track down to this function in
MarkLogicWordAddin:

line 1368 MLA.insertWordOpenXML = function(opc_xml)

and more particularly that line:

line 1381 window.external.insertWordOpenXML(v_docx);

Glad for any ideas

cheers,
Jakob.



On Wed, Feb 22, 2012 at 17:59, Jakob Fix jakob@gmail.com wrote:
 Thanks Pete,

 That's extra quick! :)
 I got the zip. and am updating the msi as we speak.

 cheers,
 Jakob.



 On Wed, Feb 22, 2012 at 17:45, Pete Aven pete.a...@marklogic.com wrote:
 Hi Jakob,

1) this add-in is supported for Word 2010

 Though the Addin will install with Office 2010; the XQuery API with the 
 Toolkit that is currently available on the Community site is only compatible 
 with the 2007 flavor of WordprocessingML.

 The TK has been updated for 2010 support and is currently sitting in a 
 repository where I'm told it will be released onto the unsuspecting, Office 
 2010-hungry masses at some point in the future.  Until then, I've sent you a 
 snapshot of the latest TK to your gmail.

2) if so how can one debug this Javascript code (is there a Firebug-like 
tool for this?).

 Unfortunately, not really.  Develop for the Addin application everything you 
 can outside of the context of the Addin (In IE).  You can use IE8 which has 
 developer tools which are similar to firebug.  Once the application is in 
 the Addin however and calling the MLA functions, your only real option is to 
 use alert()s (or write logs to the filesystem, which you can do with 
 JavaScript in IE).

 MLA.insertBlockContent(response.responseXML);

 This function really should be deprecated.  Instead of the simple 
 Sample, I'd suggest using the Sample Authoring App to enrich/insert 
 content, and taking a look at the function MLA.insertWordOpenXML().  
 Once you grok this function, you will keep Word in a headlock and 
 pretty much have your way with it. :)

 Hope this helps,
 Pete

 -Original Message-
 From: general-boun...@developer.marklogic.com 
 [mailto:general-boun...@developer.marklogic.com] On Behalf Of Jakob 
 Fix
 Sent: Wednesday, February 22, 2012 11:24 AM
 To: General Mark Logic Developer Discussion
 Subject: [MarkLogic Dev General] Word add-in and Word 2010

 Hello again,

 I've installed successfully the Word add-in and am able to search using the 
 sample provided in the download.

 However, the double-click on a found paragraph does not insert it 
 into the currently open word document. Probably, things have changed 
 from
 2007 to 2010.  Looking at the Javascript code in Samples/search/search.js I 
 find this line

Re: [MarkLogic Dev General] Word add-in and Word 2010

2012-02-23 Thread Pete Aven
Jakob!

These documents are DOCX and were created by me when playing around with the 
tool kit and saved directly to MarkLogic via WebDAV. Now, given that the 
error message is the same as above and it was inappropriate there, I wonder 
what the reason might be here.

The error message usually indicates that Word doesn't like the XML you are 
trying to insert.  So, something may be wrong with the XML created for insert.  

Things to check:

1) Validate that the documents are indeed saved to ML after using WebDAV.
WebDAV often does not work properly, especially on Windows. I'm 
assuming it works as you get search results, but, just in case.

2)  Office Open XML Extract and WordprocessingML Process pipelines are enabled
I know the former is, but the latter?
For WordprocessingML Process, which version of the server are you 
using?  5.0 supports the 2010 format, but previous versions do not.  Let me 
know if you are using an earlier version and I can forward the appropriate 
files (there are 2 .xqy, they're small).

3) Did you copy over the latest version of word-processing-ml-support.xqy that 
I sent you in the .zip to <server-root>/Modules/MarkLogic/openxml ?
This latest copy has support for the 2010 flavor of WordprocessingML, 
where the one downloadable from Community does not.

 Interestingly enough, I'm not getting any results for words appearing in the 
 boilerplate documents, are they excluded from the search?

When you enrich a document in Word, it adds what are called 'Content Controls' 
around the selected sections within the Word application.  In the XML, these 
manifest themselves as Structured Document Tags; w:sdt elements.  

Searches are performed against any text found within child elements of w:sdt.  

When you insert, the search hit (the w:sdt from the source document)  is 
formatted using the XQuery API into a Word document in the OPC format. (at 
least, it should be when everything is working correctly.)  This OPC document 
is then inserted into the active document through MLA.insertWordOpenXML().

The boilerplates probably don't have any content within w:sdt tags and 
therefore are not showing up in searches.
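
A quick way to test that theory is to count the content controls in a document's main part. The sketch below is a rough string-level check, not how the search itself works. countContentControls and isSearchable are hypothetical names, and it assumes the WordprocessingML namespace is bound to the usual w: prefix; a namespace-aware XML query would be more robust.

```javascript
// Counts opening <w:sdt> tags in a WordprocessingML string. The character
// class after the name keeps <w:sdtPr> and <w:sdtContent> from being counted.
// Assumes the conventional "w" prefix (an assumption, not guaranteed).
function countContentControls(documentXml) {
  var matches = documentXml.match(/<w:sdt[\s>\/]/g);
  return matches ? matches.length : 0;
}

// A document with no content controls has nothing the sample search looks at.
function isSearchable(documentXml) {
  return countContentControls(documentXml) > 0;
}
```

Running this over the word/document.xml of a boilerplate and of an enriched document should show the difference described above.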

You can of course change the search by modifying Author/search/search.xqy, but 
let's not go there til we sort out insert. :)

Hope this helps,
Pete

-Original Message-
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Jakob Fix
Sent: Thursday, February 23, 2012 12:15 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Word add-in and Word 2010

Hi Pete,

OK, I grok the boilerplate functionality now (I somehow expected the files to 
already exist behind the buttons). The error message about XML markup [that] 
cannot be inserted in the specified location was kind of misleading. But 
that's cool.  I've created a couple of documents with different styles and they 
are maintained on insert, which is what you would expect when you know it's 
actually the OPC XML that's being copied and pasted, but still nice.

We're making progress, thanks a lot. :)

Next up is search: My search finds hits in docx documents right now, and the 
debug alert about the contents of the XML about to be inserted shows OPC XML, 
here's a bit:

PACKAGE XML IS <pkg:package
xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage"><pkg:part
pkg:name="/word/glossary/fontTable.xml"
pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.fontTable+xml"><pkg:xmlData><w:fonts
mc:Ignorable="w14"
xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" ...

These documents are DOCX and were created by me when playing around with the 
tool kit and saved directly to MarkLogic via WebDAV. Now, given that the error 
message is the same as above and it was inappropriate there, I wonder what the 
reason might be here. Clearly, this is not about the cursor being at the wrong 
position. By the way, the Open button for each search result works fine and 
opens the document as expected.  One of the search results is a Section the 
other one a Policy.

Interestingly enough, I'm not getting any results for words appearing in the 
boilerplate documents, are they excluded from the search?

cheers,
Jakob.



On Thu, Feb 23, 2012 at 14:54, Pete Aven pete.a...@marklogic.com wrote:
 Hi Jakob,

 Are you trying to insert from the boilerplate tab, from a search hit, or both?

 To test boilerplate: save a document as XML from Word. (just as XML, not 2003 
 XML), save this to the database, and reference it in the config file found at 
 Author/config/boilerplate.xml.

 Documents saved as XML are saved in what Microsoft calls OPC format. See 
 http://community.marklogic.com/blog/smallchanges/2009-01-08 for more details.

 Then restart Word, place your cursor somewhere in the document, goto the 
 boilerplate tab in the application

Re: [MarkLogic Dev General] cpf pipeline question

2012-02-22 Thread Pete Aven
Conversion is currently for Office 2003 documents and earlier.

With 2007/2010 we work with the XML directly.  The Office Open XML Extract 
pipeline will unzip the .docx and .pptx, and create the *_parts directory 
containing their XML components.
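
As a rough illustration of that split, the sketch below sorts a document URI into "extract" (Office Open XML Extract unzips it into a *_parts directory) or "convert" (the conversion pipelines for Office 2003 and earlier). officeHandlingFor is a hypothetical helper, not Server code, and the extension lists beyond .docx and .pptx are assumptions.

```javascript
// Hypothetical helper: decides which kind of handling a URI would get,
// based on its file extension only.
function officeHandlingFor(uri) {
  var m = /\.([A-Za-z0-9]+)$/.exec(uri);
  var ext = m ? m[1].toLowerCase() : "";
  // Office 2007/2010 formats: worked with as XML directly.
  var openXml = { docx: true, xlsx: true, pptx: true };
  // Office 2003 and earlier: handled by conversion.
  var legacy = { doc: true, xls: true, ppt: true };
  if (openXml[ext] === true) { return "extract"; }
  if (legacy[ext] === true) { return "convert"; }
  return "none";
}
```

So for /AuthoringGuide.docx the thing to look for is a *_parts directory produced by the extract pipeline, not converted output.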

Hope this helps,
Pete

-Original Message-
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Jakob Fix
Sent: Wednesday, February 22, 2012 9:37 AM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] cpf pipeline question

Hi,

So I'm experimenting with the conversion option in MarkLogic (v5.0).
CPF is installed and enabled, conversion is set to true.
Import of docx and pptx is via WebDAV.

However, conversion visibly doesn't take place.
I set logging to finest, so  I see lots of skipped lines but no outright 
errors:

2012-02-22 15:31:17.416 Fine: TaskServer: Documents: on-any-property skipping 
/AuthoringGuide.docx

Uploaded documents are visible via QC's Explore, their type is binary, and 
the properties don't show any errors, e.g.:
<prop:properties xmlns:prop="http://marklogic.com/xdmp/property">
  <cpf:processing-status xmlns:cpf="http://marklogic.com/cpf">done</cpf:processing-status>
  <cpf:property-hash xmlns:cpf="http://marklogic.com/cpf">d41d8cd98f00b204e9800998ecf8427e</cpf:property-hash>
  <cpf:last-updated xmlns:cpf="http://marklogic.com/cpf">2012-02-22T15:23:04.949+01:00</cpf:last-updated>
  <cpf:state xmlns:cpf="http://marklogic.com/cpf">http://marklogic.com/states/converted</cpf:state>
  <cpf:self xmlns:cpf="http://marklogic.com/cpf">/AuthoringGuide.docx</cpf:self>
  <prop:last-modified>2012-02-22T15:23:04+01:00</prop:last-modified>
</prop:properties>

So, no _toc.xml file or _parts directory is created with XML inside.

Could somebody please tell me what else to check?

Thanks,
Jakob.
___
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Word add-in and Word 2010

2012-02-22 Thread Pete Aven
Hi Jakob,

1) this add-in is supported for Word 2010 

Though the Addin will install with Office 2010; the XQuery API with the Toolkit 
that is currently available on the Community site is only compatible with the 
2007 flavor of WordprocessingML.  

The TK has been updated for 2010 support and is currently sitting in a 
repository where I'm told it will be released onto the unsuspecting, Office 
2010-hungry masses at some point in the future.  Until then, I've sent you a 
snapshot of the latest TK to your gmail.

2) if so how can one debug this Javascript code (is there a Firebug-like tool 
for this?).

Unfortunately, not really.  Develop everything you can for the Addin 
application outside of the context of the Addin (in IE).  You can use IE8, which 
has developer tools similar to Firebug.  Once the application is in the 
Addin, however, and calling the MLA functions, your only real option is to use 
alert()s (or write logs to the filesystem, which you can do with JavaScript in 
IE).
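
One way to make the alert() approach less painful is to route all debug output through a single helper, so the sink can be alert() inside the Addin and something quieter elsewhere. The sketch below is an illustration only. makeDebugLog is a hypothetical name, not part of the Addin, and it assumes JSON.stringify is available in the hosting browser control.

```javascript
// Hypothetical debug helper: keeps every message in memory and passes each
// formatted line to a sink function supplied by the caller.
function makeDebugLog(sink) {
  var lines = [];
  return {
    // Records one labelled value; non-strings are serialized with JSON.stringify.
    log: function (label, value) {
      var text = typeof value === "string" ? value : JSON.stringify(value);
      var line = "[" + lines.length + "] " + label + ": " + text;
      lines.push(line);
      if (typeof sink === "function") { sink(line); }
      return line;
    },
    // Returns everything logged so far as one newline-separated string.
    dump: function () {
      return lines.join("\n");
    }
  };
}
```

Inside the Addin this could be created as makeDebugLog(function (l) { alert(l); }); in plain IE, the sink could write to the developer tools console instead, and dump() gives one string to show or save at the end.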

 MLA.insertBlockContent(response.responseXML);  

This function really should be deprecated.  Instead of the simple Sample, I'd 
suggest using the Sample Authoring App to enrich/insert content, and taking a 
look at the function MLA.insertWordOpenXML().  Once you grok this function, you 
will keep Word in a headlock and pretty much have your way with it. :)

Hope this helps,
Pete

-Original Message-
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Jakob Fix
Sent: Wednesday, February 22, 2012 11:24 AM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] Word add-in and Word 2010

Hello again,

I've installed successfully the Word add-in and am able to search using the 
sample provided in the download.

However, the double-click on a found paragraph does not insert it into the 
currently open word document. Probably, things have changed from
2007 to 2010.  Looking at the Javascript code in Samples/search/search.js I 
find this line:

MLA.insertBlockContent(response.responseXML);

which seems to be responsible for the insertion of the paragraph.

So, I guess my question is whether 1) this add-in is supported for Word 2010 
and 2) if so how can one debug this Javascript code (is there a Firebug-like 
tool for this?).

cheers,
Jakob.
___
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] layout format issue with docx

2011-11-14 Thread Pete Aven
Hi Gurbeer,

.docx to XHTML conversion is not currently provided with MarkLogic Server.   
The conversion pipelines that ship with the Server currently convert Office 
2003 to XHTML, but not 2007/2010.  This leads me to believe you have some 
custom code that you or someone else has written to perform this conversion.

For more help, I think you might want to try and share some of your code and 
what you're trying to do, so others may weigh in more effectively.  And/Or if 
you're not the original author of the conversion code, try and contact who is, 
as they may be able to provide suggestions to you based on their familiarity 
with what has been written.

Hope this is useful,
Pete



From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Singh, Gurbeer
Sent: Monday, November 14, 2011 10:40 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] FW: layout  format issue with docx

Guys any suggestion for this.

~Gurbeer

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Singh, Gurbeer 
(Corporate and Banking Technology)
Sent: Friday, November 11, 2011 4:54 PM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] layout  format issue with docx

Hi

I am facing an issue with docx conversion. It's a formatting issue: if I 
submit a docx with a certain layout, the layout is not maintained in the XHTML.


The original docx contains some formatting, but the generated XHTML renders it 
as normal text. Sometimes the formatting is maintained, but sometimes it is 
not, as in the case below.

For eg.

Original document contains

Treasurer's Summary

 1.  The Treasurer's Summary is prepared by State Street Fund Administration 
and reviewed by MS Fund Administration for inclusion in the Board materials.  
Items Required are as follows:

*   Net Assets
*   Credit Lines
*   Dividends Declared for both Open-end and Closed-end Funds
*   Affiliated Cash Sweep - Yield Comparison
*   Net Flows
*   Ratio of Expenses
*   Fund Actions for the Quarter
*   NAV Errors for the Quarter



XHTML generated is like this :


Treasurer's Summary

The Treasurer's Summary is prepared by State Street Fund Administration and 
reviewed by MS Fund Administration for inclusion in the Board materials. Items 
Required are as follows:

Net Assets

Credit Lines

Dividends Declared for both Open-end and Closed-end Funds

Affiliated Cash Sweep - Yield Comparison

Net Flows

Ratio of Expenses

Fund Actions for the Quarter

NAV Errors for the Quarter



XHTML generated



<html version="-//W3C//DTD XHTML 1.1//EN" xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <style id="dynCom" type="text/css"></style>
    <script type="text/javascript" language="JavaScript">
function msoCommentShow(anchor_id, com_id)
{
  if(msoBrowserCheck())
  {
    c = document.all(com_id);
    a = document.all(anchor_id);
    if (null != c &amp;&amp; null == c.length &amp;&amp; null != a &amp;&amp; null == a.length)
    {
      var cw = c.offsetWidth;
      var ch = c.offsetHeight;
      var aw = a.offsetWidth;
      var ah = a.offsetHeight;
      var x  = a.offsetLeft;
      var y  = a.offsetTop;
      var el = a;
      while (el.tagName != "BODY")
      {
        el = el.offsetParent;
        x = x + el.offsetLeft;
        y = y + el.offsetTop;
      }
      var bw = document.body.clientWidth;
      var bh = document.body.clientHeight;
      var bsl = document.body.scrollLeft;
      var bst = document.body.scrollTop;
      if (x + cw + ah / 2 &gt; bw + bsl &amp;&amp; x + aw - ah / 2 - cw &gt;= bsl )
      { c.style.left = x + aw - ah / 2 - cw; }
      else
      { c.style.left = x + ah / 2; }
      if (y + ch + ah / 2 &gt; bh + bst &amp;&amp; y + ah / 2 - ch &gt;= bst )
      { c.style.top = y + ah / 2 - ch; }
      else
      { c.style.top = y + ah / 2; }
      c.style.visibility = "visible";
    }
  }
}
function msoCommentHide(com_id)
{
  if(msoBrowserCheck())
  {
    c = document.all(com_id);
    if (null != c &amp;&amp; null == c.length)
    {
      c.style.visibility = "hidden";
      c.style.left = -1000;
      c.style.top = -1000;
    }
  }
}
function msoBrowserCheck()
{
  ms = navigator.appVersion.indexOf("MSIE");
  vers = navigator.appVersion.substring(ms + 5, ms + 6);
  ie4 = (ms &gt; 0) &amp;&amp; (parseInt(vers) &gt;= 4);
  return ie4;
}
if (msoBrowserCheck())
{
  document.styleSheets.dynCom.addRule(".msocomanchor","background: infobackground");
  document.styleSheets.dynCom.addRule(".msocomoff","display: none");
  document.styleSheets.dynCom.addRule(".msocomtxt","visibility: hidden");
  document.styleSheets.dynCom.addRule(".msocomtxt","position: absolute");
  document.styleSheets.dynCom.addRule(".msocomtxt","top: -1000");

Re: [MarkLogic Dev General] NOVAMUG Meeting in Reston, VA

2011-08-03 Thread Pete Aven
If you join the meetup group, the location shall be revealed.  :)

http://www.meetup.com/Mark-Logic-User-Group/

Pete Aven
Senior Engineer
MarkLogic Corporation
pete.a...@marklogic.com
Phone: +1 650 655 2300
Cell:  +1 650 504 3115
www.marklogic.com

This e-mail and any accompanying attachments are confidential. The information 
is intended solely for the use of the individual to whom it is addressed. Any 
review, disclosure, copying, distribution, or use of this e-mail communication 
by others is strictly prohibited. If you are not the intended recipient, please 
notify us immediately by returning this message to the sender and delete all 
copies. Thank you for your cooperation.

From: general-boun...@developer.marklogic.com 
[general-boun...@developer.marklogic.com] On Behalf Of mcundi...@comcast.net 
[mcundi...@comcast.net]
Sent: Wednesday, August 03, 2011 12:01 PM
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] NOVAMUG Meeting in Reston, VA

Can someone point me to directions/map for location of the meetup?

Thanks,
Morgan


From: Nuno Job nunojobpi...@gmail.com
To: General Mark Logic Developer Discussion general@developer.marklogic.com
Sent: Tuesday, July 26, 2011 8:27:09 AM
Subject: [MarkLogic Dev General] NOVAMUG Meeting in Reston, VA

Hi Guys,

If you are interested in the upcoming features of MarkLogic 5.0 you should 
attend the next NOVAMUG user group meeting.

Sign up, RSVP or simply more details at:
http://www.meetup.com/Mark-Logic-User-Group/events/26538451/


Thank you,
Nuno
What's new in MarkLogic Server 5.0?

Wednesday, August 3, 2011, 7:00 PM

SELECTED BY:

This location is shown only to members

Dipti Borkar, Senior Product Manager at MarkLogic, will present a sneak preview 
of MarkLogic Server 5.0 with a focus on enhanced binary support.  You will 
learn about what's available in MarkLogic 4.2 for loading, managing, querying, 
and publishing binary files.  We'll then take a peek into the future and 
describe some enhancements that will enable you to store all your content, 
metadata and binary files in MarkLogic to build information-rich applications.



Dipti is a Senior Product Manager at MarkLogic where she oversees  the 
development of the MarkLogic Server and MarkLogic Cloud Offerings. Prior to 
joining MarkLogic, Dipti spent several years at IBM starting as a software 
engineer and most recently as Manager of Development  Continuing Engineering 
for the DB2 server team. Dipti holds a Masters degree in Computer Science from 
the University of California, San Diego with a specialization in databases and 
is an MBA candidate at the  Haas School of Business at University of 
California, Berkeley.

___
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Authoring App for Word - Boilerplate Does not Import

2011-04-14 Thread Pete Aven
Hi Lakshmi,

The boilerplate functionality will insert Word documents saved in OPC format 
into the document being authored.

OPC stands for Open Packaging Convention and is essentially the format you get 
when you Save a Word doc as XML.

Here's a blog post on OPC format: 
http://developer.marklogic.com/blog/smallchanges/2009-01-08

To get the required format, Save your Word document as XML.  (Note: Save As 
XML, NOT Word 2003 XML, as it's different.)

If you want to create the OPC format programmatically from an extracted .docx, 
there are functions in the Toolkit XQuery API that can help.  The post and 
XQuery functions can help explain how to work with images in Word as well.
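
To give a feel for what "the OPC format" looks like structurally, here is a minimal sketch that wraps already-extracted XML parts in a pkg:package. It is not the Toolkit's implementation. buildFlatOpc and stripXmlDeclaration are hypothetical names; it handles XML parts only (no images or other binary parts), does no attribute escaping, and a package Word will accept also needs its relationship parts, which the caller would have to include in the list.

```javascript
var PKG_NS = "http://schemas.microsoft.com/office/2006/xmlPackage";

// Removes a leading XML declaration so a part can be nested inside xmlData.
function stripXmlDeclaration(xml) {
  return xml.replace(/^\s*<\?xml[^>]*\?>\s*/, "");
}

// Each part is { name, contentType, xml }. Names and content types are
// assumed to be attribute-safe strings (no escaping is done in this sketch).
function buildFlatOpc(parts) {
  var out = '<pkg:package xmlns:pkg="' + PKG_NS + '">';
  for (var i = 0; i < parts.length; i++) {
    var p = parts[i];
    out += '<pkg:part pkg:name="' + p.name + '"' +
           ' pkg:contentType="' + p.contentType + '">' +
           '<pkg:xmlData>' + stripXmlDeclaration(p.xml) + '</pkg:xmlData>' +
           '</pkg:part>';
  }
  return out + '</pkg:package>';
}
```

This also shows why a binary .docx, or word/document.xml on its own, is not enough: the OPC form is a single XML document with every part of the package carried inside it.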

-pete

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Lakshmi Narasimhan
Sent: Thursday, April 14, 2011 6:58 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Authoring App for Word - Boilerplate Does not 
Import

I have installed and configured the Authoring App for Word 
(http://developer.marklogic.com/code/marklogic-sample-authoring), and 
configured the registry, authoring.js and config.xqy for the HTTP server.
The HTTP server is associated with a database. I ingested an ABC.docx (MS Word 
2007) file into the database as binary with URL /ABC.docx.

I modified boilerplate.xml
<config:boilerplate>
  <config:display-label>disclaimer</config:display-label>
  <config:document-uri>/ABC.docx</config:document-uri>

I created a new word document, in Mark Logic Add-in boilerplate tab, when I 
click on disclaimer, the tool tip shows /ABC.docx.
When I click on disclaimer button, the text from the word document(stored in 
MLS) is not populated into the new word document.

I extracted the XML out of word:

xdmp:zip-get(doc("/ABC.docx"), "word/document.xml")

and ingested the XML back into Mark Logic. I updated boilerplate.xml as below:
<config:boilerplate>
  <config:display-label>disclaimer</config:display-label>
  <config:document-uri>/ABC.docx.xml</config:document-uri>


I created a new word document, in Mark Logic Add-in boilerplate tab, when I 
click on disclaimer, the tool tip shows /ABC.docx.xml.
When I click on disclaimer button, the text from the word document(stored in 
MLS) is not populated into the new word document.

Is there a different format in which I should be storing office documents to 
include them via boiler plate?
If so how does it work for images. etc

Thanks
Lakshmi




___
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Fab database

2011-01-28 Thread Pete Aven
Hi Geert,

Fab stands for Fabrication and this database is used by Information Studio.  In 
an Information Studio flow, when you apply transforms, the transforms will be 
applied to the collected documents in the Fab database before they are inserted 
into the target database. 

Hope this helps,
Pete

-Original Message-
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert Josten
Sent: Friday, January 28, 2011 12:37 AM
To: General MarkLogic Developer Discussion (general@developer.marklogic.com)
Subject: [MarkLogic Dev General] Fab database

Hi,

I recently performed a clean installation of MarkLogic Server 4.2. Looking 
around in the Admin interface to see what is installed initially, I 
noticed the presence of a database named 'Fab'. It seems to be 
unconnected, though it looks as if it is related to the App-Services. Is it 
redundant, or is it used as a secondary docs database by the App-Services?

Kind regards,
Geert
___
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Sharepoint connector for Marklogic

2010-06-15 Thread Pete Aven
Hi Satish,

The connector will mirror anything in a SharePoint document library to 
MarkLogic.  A SharePoint document library could contain XML files, or the files 
could be text, but they're most likely Office documents.  

MarkLogic Server provides a feature we call the Content Processing Framework 
(CPF).  This framework allows us to create pipelines that will transform 
document content as soon as it's saved to the Server.

If the mirrored documents are from Office 2007, the Server comes with the 
Office Open XML Extract pipeline, which will automatically unzip the documents 
and save the XML files they are composed of. Also, we provide the MarkLogic 
Toolkits for Word, Excel, and PowerPoint, which each come with pipelines and 
XQuery APIs for simplifying the use of Open XML within the Server.  

If the mirrored documents are Office 2003 flavor, the Server also comes with 
pipelines for transforming these formats to XHTML.

You can find more information about the Content Processing Framework and the 
Office Toolkits on our developer site.  But in a nutshell: It may start out as 
a Binary Object in SharePoint, but once it's in MarkLogic, we save it in the 
format we want for search, re-use, and delivery to multiple consumers.  The 
format you save as will be determined by your particular use-case and what 
solution you're ultimately trying to provide to your users/authors/etc.

http://developer.marklogic.com/

Hope this helps,
Pete



From: general-boun...@developer.marklogic.com 
[general-boun...@developer.marklogic.com] On Behalf Of Satish 
[garresat...@gmail.com]
Sent: Tuesday, June 15, 2010 8:58 PM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Sharepoint connector for Marklogic

Hi,

I am interested in the SharePoint Connector for MarkLogic. After reading through 
the document, I have a couple of questions.
The document says a SharePoint library can be mirrored to MarkLogic, so the 
documents are copied to MarkLogic from SharePoint. So I assume they are stored 
as BLOBs inside MarkLogic and not converted to XML format. Is this correct? If 
so, how can I build custom search applications on this content, which is in the 
form of BLOBs inside MarkLogic? How do you justify that MarkLogic search on this 
content is better than other search?

Any clarifications on these will be greatly helpful. Also, if anyone has already 
implemented the SharePoint connector for MarkLogic, please share your 
experiences.

Thanks,
satish.
___
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general


RE: [MarkLogic Dev General] Use MarkLogic server

2010-02-19 Thread Pete Aven
Hi Vishal,

Start with the Getting Started Guide 
http://developer.marklogic.com/pubs/4.1/books/gs.pdf

It introduces the Server and how to query.  It's a brief and very useful guide.

Next check out the Technical Overview:
http://developer.marklogic.com/howto/tutorials/technical-overview.xqy

The Overview provides some direction on loading and querying, as well as other 
aspects of the Server and pointers to other documents and tutorials that can 
help get you started.

If you want to dive right in though, you may find the online tutorials useful 
as well:

http://developer.marklogic.com/howto/tutorials/

Look at Hands-On Understanding Applications, parts 1 and 2:
http://developer.marklogic.com/howto/tutorials/2009-01-get-started-apps.xqy
http://developer.marklogic.com/howto/tutorials/2009-01-get-started-apps-2.xqy

Stepping through these two tutorials, you load XML, write XQuery,  create your 
first applications,  and learn hands-on about different aspects of the Server.  
As you progress, you'll find the rest of the documentation to be very useful.

http://developer.marklogic.com/pubs/4.1/default.xqy

Hope this helps,
Pete




From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of vishal duddu
Sent: Friday, February 19, 2010 2:12 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] Use MarkLogic server

I can find documents but I am not able to link those to practice on my local 
server.

I want to know more on practical ways...
I want to practice with the server: load the XML docs and write XQuery to 
retrieve those docs...

Can anyone please help me I have lots of questions where documents are not 
helpful.

waiting for your response.

Vishal
(814)384-1499


On Fri, Feb 19, 2010 at 2:51 PM, Stewart Shelline 
shellin...@ldschurch.org wrote:
Vishal:

Have you been to this page (http://developer.marklogic.com/pubs/4.1/)? This has 
documentation on everything you will need to get started with MarkLogic.

The Developer and Admin guides should be one of your first stops.

SS

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of vishal duddu
Sent: Friday, February 19, 2010 12:20 PM
To: general
Subject: [MarkLogic Dev General] Use MarkLogic server

Hi,

I want to know about:
1. How to insert data in to Marklogic
2. How to retrieve that data.
3. Explanation on Mark Logic tool (various tabs like Database, security, 
forest, triggers, schemas etc)
4. Application builders?
5. Hosts?

Waiting for your response.

--
Thanks
Vishal


NOTICE: This email message is for the sole use of the intended recipient(s) and 
may contain confidential and privileged information. Any unauthorized review, 
use, disclosure or distribution is prohibited. If you are not the intended 
recipient, please contact the sender by reply email and destroy all copies of 
the original message.


___
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general



--
Thanks
Vishal Duddu
(814)384-1499
___
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general


RE: [MarkLogic Dev General] is there any Marklogic add-ins for word 2003 application???

2009-09-24 Thread Pete Aven
Hi Mohanraj,

The MarkLogic Add-ins for Office are implemented for Office 2007 to take 
advantage of the new Open XML formats. There are no Add-ins currently available 
for Office 2003.  

It is possible to build an Add-in for Office 2003; for that, you'll want to 
check the MSDN developer library. This may get you started: 
http://support.microsoft.com/kb/302901 .

Hope this helps,
Pete


From: general-boun...@developer.marklogic.com 
[general-boun...@developer.marklogic.com] On Behalf Of Mohanraj 
[mohan...@laserwords.com]
Sent: Thursday, September 24, 2009 5:28 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] is there any Marklogic add-ins for word
2003 application???

Hi All,

I hope all of you are fine.
I want to know whether there are any Marklogic add-ins for the Word 2003
application. If not, how can we build an add-in on our own?
Kindly guide me with your valuable suggestions.

Thanks & Regards,
Mohanraj D
Software Programmer
Laserwords Private Limited
+91-9445451318
___
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general


RE: [MarkLogic Dev General] Problem in unzipping the MS Office Word 2007 document when loaded as binary into MarkLogic server database.

2009-08-18 Thread Pete Aven
Hi Anuj,

   Open XML Extract unzips Office 2007 documents and inserts their 
associated parts.
   WordprocessingML Process merges split runs of text within the document.xml 
part of an unzipped .docx (Word 2007) package.
   MS Office Conversion is for converting Office 2003 documents.

Assuming you've attached the pipelines, you need to save your document to the 
domain the pipelines are attached to. 
Assuming the Documents database, you can check the domain in the Admin UI by 
navigating to: Databases -> Documents -> Content Processing -> Domains

The default domain is /. That means if you have a document, 'foo.docx', 
you need to save it as '/foo.docx' in MarkLogic for the pipeline to work.

.docx packages are extracted into a folder named for the original package.  So 
when you save /foo.docx, it will be extracted to a directory, 
/foo_docx_parts/.
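
As a rough illustration of the naming convention described above, the following 
Python sketch maps a package URI to the directory its parts would land in. The 
helper name `parts_directory` is hypothetical, and the rule is inferred only 
from the /foo.docx example in this message; the pipeline itself may treat edge 
cases differently.

```python
import posixpath


def parts_directory(uri: str) -> str:
    """Guess the parts directory for a package URI.

    Assumption: the directory is "<name>_<extension>_parts/" next to the
    original document, as in /foo.docx -> /foo_docx_parts/.
    """
    directory, filename = posixpath.split(uri)
    stem, ext = posixpath.splitext(filename)
    if not ext:
        raise ValueError(f"URI has no file extension: {uri!r}")
    # ext includes the leading dot, so drop it before building the folder name.
    folder = f"{stem}_{ext[1:]}_parts/"
    return posixpath.join(directory, folder)


if __name__ == "__main__":
    print(parts_directory("/foo.docx"))  # /foo_docx_parts/
```

This is handy when checking whether extraction ran: if nothing exists under the 
computed directory, the document was probably saved outside the domain.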


Hope this helps,
Pete


From: general-boun...@developer.marklogic.com 
[general-boun...@developer.marklogic.com] On Behalf Of 
anuj.kum...@cognizant.com [anuj.kum...@cognizant.com]
Sent: Tuesday, August 18, 2009 12:40 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Problem in unzipping the MS Office Word
2007 document when loaded as binary into MarkLogic server database.

Hi All,
I want to load a Word 2007 document into a MarkLogic Server database, which in 
turn should unzip the document parts and load them into the database.
I have attached the WordprocessingML Process, MS Office Conversion, and 
Office OpenXML Extract pipelines to my database, but it still does not unzip 
and load the parts of the documents into the database. Please help.
Thanks in advance.

Regards,
Anuj Kumar
Cognizant,Kolkata
Vnet:306409

This e-mail and any files transmitted with it are for the sole use of the 
intended recipient(s) and may contain confidential and privileged information.
If you are not the intended recipient, please contact the sender by reply 
e-mail and destroy all copies of the original message.
Any unauthorised review, use, disclosure, dissemination, forwarding, printing 
or copying of this email or any action taken in reliance on this e-mail is 
strictly
prohibited and may be unlawful.

___
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general


[MarkLogic Dev General] RE: Mark Logic Connector for Sharepoint - Error

2009-06-29 Thread Pete Aven
Hi Gary,

Here are some configuration items to double-check; any of them can cause this 
error if not set up properly:


1)  Have you copied the .xqy files to MarkLogic, and does the XDBC Server 
you've configured have its root directory set to the SharePoint directory 
containing the .xqy files?

2)  Does your password contain any special characters?  Some characters 
(^, #, @) are not properly escaped. This has been fixed, and the fix will be 
available in a future release.

3)  Are you a site collection administrator on the SharePoint site?  (Site 
Settings -> Users and Permissions -> Site Collection Administrators) You must 
be a site collection admin to be able to configure/use the connector.

Hope this helps,
Pete


From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Flancer, Gary 
[Content DSI]
Sent: Monday, June 29, 2009 11:24 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Mark Logic Connector for Sharepoint - Error

I was attempting to install the Mark Logic Connector for Sharepoint.
I followed the instructions in the Administrator's Guide, but I am stuck.

In SharePoint, on the Mark Logic Connector - Administration screen, I get the 
following error when I hit the Configure or Test Connection buttons:

Version: MLConnector.SPCModuleRequest: Object reference not set to an instance 
of an object.

I also tried the mlcadm command on my Sharepoint server using the 'test' 
command and get the same error.  Similarly, in the mlconnector.log file I see 
the same error.

Does anybody know what in general could cause this??  Do the supplied .dll 
files need to be placed somewhere??

Thx,
Gary

___
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general


[MarkLogic Dev General] RE: Excel Toolkit Information

2009-06-18 Thread Pete Aven
Hi Pramit,

The toolkit for Excel consists of 3 components : 1)  an XQuery library for 
server side workbook manipulation/generation 2) an Addin for Excel 2007 that 
manifests itself as a web browser within the Excel application and includes a 
javascript library for interacting with the active Workbook 3) Sample 
applications.

The plugin will not install within Excel 2003 and the XQuery api is written to 
take advantage of the Open XML format for Excel (SpreadsheetML).  While Office 
2003 introduced the concept of XML within its applications, these belong to 
different schemas than the XML found in Office 2007.

There is currently no developer/customization manual available.  The toolkit 
includes API docs for the XQuery and JavaScript functions.  A Toolkit guide is 
also included. 

To help get you started in the code: the javascript api calls functions within 
the Addin.  Within the source for the Addin solution,  you'll find the majority 
of the Addin functions defined within UserControl.cs. The Addin solution is 
written in C# and uses the Visual Studio tools for Office as well as the Excel 
Object model.  (all documented on msdn).  The XQuery api does not require the 
Addin or javascript and includes standard functions as well as wrappers for 
MarkLogic functions which are all documented as well.

It should be pretty easy to figure out what's going on in the code, but if you 
have any questions, please let me know.

Thanks and I hope this helps,
Pete

From: general-boun...@developer.marklogic.com 
[general-boun...@developer.marklogic.com] On Behalf Of 
pramit.gh...@cognizant.com [pramit.gh...@cognizant.com]
Sent: Thursday, June 18, 2009 3:54 PM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Excel Toolkit Information

Team,

This is Pramit from Cognizant. We are an ML partner. I need some information,
listed below, on the new MarkLogic Excel Toolkit.

(1) As per the documentation it is well integrated with Excel 2007.
But does it work with Excel 2003?
(2) Is there any developer/customization manual available for the
same? We are looking to add a few features; for example, when a
user opens a document from the ML server, we want to lock the
document.

Appreciate if you can help us to get the above information at the
earliest.

Regards,
Pramit
_
Pramit Ghosh
Program Manager - Consulting | Content & Digital Media
Information, Media & Entertainment | Cognizant Technology Solutions
Mobile: (201) 290-0913 | pramit.gh...@cognizant.com

This e-mail and any files transmitted with it are for the sole use of the 
intended recipient(s)
and may contain confidential and privileged information.If you are not the 
intended recipient,
please contact the sender by reply e-mail and destroy all copies of the 
original message.
Any unauthorized review, use, disclosure, dissemination, forwarding, printing 
or copying
of this email or any action taken in reliance on this e-mail is strictly 
prohibited and may be
unlawful.
___
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general


[MarkLogic Dev General] FW: Excel Toolkit Information

2009-06-18 Thread Pete Aven
Hi Pramit,

 Resending.  I hit reply-all but you weren't copied for some reason.

Regards,
Pete

___
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general


RE: [MarkLogic Dev General] Important: Resolution of an Issue in Marklogic

2008-11-14 Thread Pete Aven
Hi Joydeep,

 

There is currently no xhtml conversion for OOXML packages.   You'll have
to write your own transformation.

 

The 'Office OpenXML Extract' pipeline will extract the pieces from a
.docx package into a parts folder.

 

Example:  if you save the document HelloWorld.docx to a domain with
the 'Office OpenXML Extract' pipeline attached, you'll find the binary,
along with all the parts in the parts folder.

HelloWorld.docx

HelloWorld_docx_parts/word/document.xml

HelloWorld_docx_parts/[Content_Types].xml

HelloWorld_docx_parts/_rels/.rels

HelloWorld_docx_parts/etc. ...   (rest of parts)

 

You can query and display the pieces in CQ. 

 

Styling information for the .docx can be found in the w:rPr and w:pPr
elements in document.xml.  The child elements of these will contain
either direct formatting elements, or references to the styles.xml file
from the package, which contains all style definitions for the document.
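
To make the distinction between direct formatting and style references 
concrete, here is a small Python sketch that walks the w:pPr and w:rPr 
elements of a document.xml string. The sample XML and the function name 
`summarize_formatting` are made up for illustration; only the WordprocessingML 
namespace and the element names (w:pStyle, w:rStyle, w:b, w:i) come from the 
Open XML format itself.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical, heavily trimmed document.xml used only for this sketch.
SAMPLE = f"""<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:pPr><w:pStyle w:val="Heading1"/></w:pPr>
      <w:r><w:rPr><w:b/><w:i/></w:rPr><w:t>Hello</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def summarize_formatting(document_xml: str):
    """Return (style references, direct formatting element names)."""
    root = ET.fromstring(document_xml)
    style_refs, direct = [], []
    for tag in ("pPr", "rPr"):
        for props in root.iter(f"{{{W}}}{tag}"):
            for child in props:
                local = child.tag.split("}")[1]
                if local in ("pStyle", "rStyle"):
                    # The w:val attribute names a style defined in styles.xml.
                    style_refs.append(child.get(f"{{{W}}}val"))
                else:
                    # Anything else is formatting applied directly in place.
                    direct.append(local)
    return style_refs, direct


if __name__ == "__main__":
    print(summarize_formatting(SAMPLE))
```

A transformation to XHTML would need both lists: the style references have to 
be resolved against styles.xml, while the direct formatting can be mapped as is.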

 

Hope this helps,

Pete

 

From: Joydeep_Sinha [mailto:[EMAIL PROTECTED] 
Sent: Friday, November 14, 2008 1:31 PM
To: Pete Aven
Cc: Vivek_Nagasundara; general@developer.marklogic.com
Subject: RE: [MarkLogic Dev General] Important: Resolution of an Issue
in Marklogic

 

Hi Pete,

 

Just a few more queries. Until now, the way we have handled other file
formats in Mark Logic has been:

* Select a file of a particular format.

* Upload it into Mark Logic using xdmp-load query.

* During this process, the file was converted into xhtml and
xml components, and a parts folder was generated to store the
associated images of the content.

* Display it in the CQ editor using a doc query.

Can you please help us correlate the above process with the docx format?
That is, is there a single file, like the xhtml, that holds the styling
information of the ingested docx file, or do we need to construct it?

 

Thanks,

Joydeep Sinha

Onsite Co-ordinator  - IDMF PoC

Media and Entertainment - Solution Offerings

Satyam Computer Services Limited.

Mobile - (001)-6103020388

 

 

 

From: Joydeep_Sinha 
Sent: Thursday, November 13, 2008 1:58 PM
To: 'Pete Aven'
Cc: Vivek_Nagasundara
Subject: RE: [MarkLogic Dev General] Important: Resolution of an Issue
in Marklogic

 

Hi Pete,

 

Thanks for your comments. We will surely try this out and get back to you
if we face further implementation challenges.

 

Thanks,

Joydeep Sinha

Media and Entertainment - Solution Offerings

Satyam Computer Services Limited.

Mobile - (001)-6103020388

 

 

 

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Pete Aven
Sent: Wednesday, November 12, 2008 2:36 PM
To: General Mark Logic Developer Discussion
Cc: Vivek_Nagasundara; Thangavelu_Senniyappan
Subject: RE: [MarkLogic Dev General] Important: Resolution of an Issue
in Marklogic

 

Hi Joydeep,

 

In MarkLogic Server 4.0, check out the 'Office OpenXML Extract' and
'WordprocessingML Process' pipelines in Content Processing.  

 

These are not enabled by default when you install Content Processing, so
you will have to attach them to your domain.  These 2 pipelines, along
with 'Status Change Handling', will process Word 2007 documents saved to
the Server.

 

Office Open XML Extract:  extracts the parts from a .docx package into a
directory named for the originating file.

WordprocessingML process:   updates document.xml (extracted from every
.docx package), by merging text split across runs (w:r elements) to
help improve search results and clean up the content for repurposing.

 

It's  also easy to assemble Word documents on the server by using the
xdmp:zip* utilities.

 

Hope this helps,

Pete

 

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Joydeep_Sinha
Sent: Wednesday, November 12, 2008 10:44 AM
To: general@developer.marklogic.com
Cc: Vivek_Nagasundara; Thangavelu_Senniyappan
Subject: [MarkLogic Dev General] Important: Resolution of an Issue
in Marklogic
Importance: High

 

Hi All,

 

I am from Satyam Computer Services Limited, and we generally build
solutions on top of Marklogic. Currently we are using Marklogic to
upload docx files (MS Office 2007 formats), but we are unaware of
Marklogic's capabilities for converting them to xhtml/xml components.
Please confirm how we can ingest docx formats into Marklogic and
how the latest version of Marklogic supports handling of the latest
Office formats.

It would be great if you could provide us with the exact XQuery for
handling this issue, or tell us what change would be required to
allow ingestion and retrieval of Office 2007 formats to and from
Marklogic.

A quick resolution would be greatly appreciated.

 

Thanks and Regards,

Joydeep Sinha

Onsite Co-ordinator  - IDMF PoC

Media and Entertainment - Solution Offerings

Satyam Computer Services Limited.

Mobile - (001)-6103020388

 

 



DISCLAIMER:
This email (including any attachments) is intended for the sole use of
the intended recipient/s and may

RE: [MarkLogic Dev General] Important: Resolution of an Issue in Marklogic

2008-11-12 Thread Pete Aven
Hi Joydeep,

 

In MarkLogic Server 4.0, check out the 'Office OpenXML Extract' and
'WordprocessingML Process' pipelines in Content Processing.  

 

These are not enabled by default when you install Content Processing, so
you will have to attach them to your domain.  These 2 pipelines, along
with 'Status Change Handling', will process Word 2007 documents saved to
the Server.

 

Office Open XML Extract:  extracts the parts from a .docx package into a
directory named for the originating file.

WordprocessingML process:   updates document.xml (extracted from every
.docx package), by merging text split across runs (w:r elements) to
help improve search results and clean up the content for repurposing.
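
To show what "merging text split across runs" means in practice, here is a 
minimal Python sketch. It is not the pipeline's actual logic: the function 
names are hypothetical, and the rule used here (merge adjacent w:r elements 
whose w:rPr properties look identical) is an assumption chosen for 
illustration.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical paragraph: "Hello " is split across two plain runs,
# followed by a bold run that must stay separate.
SAMPLE = (
    f'<w:p xmlns:w="{W}">'
    "<w:r><w:t>Hel</w:t></w:r>"
    "<w:r><w:t>lo </w:t></w:r>"
    "<w:r><w:rPr><w:b/></w:rPr><w:t>World</w:t></w:r>"
    "</w:p>"
)


def _props_key(run):
    """Comparable summary of a run's direct formatting (w:rPr children)."""
    props = run.find(f"{{{W}}}rPr")
    if props is None:
        return ()
    return tuple((c.tag, tuple(sorted(c.attrib.items()))) for c in props)


def merge_runs(paragraph):
    """Merge adjacent text runs that carry the same formatting, in place."""
    previous = None
    for run in list(paragraph):
        text = run.find(f"{{{W}}}t")
        if run.tag != f"{{{W}}}r" or text is None:
            previous = None  # a non-text node breaks the sequence
            continue
        if previous is not None and _props_key(previous) == _props_key(run):
            prev_text = previous.find(f"{{{W}}}t")
            prev_text.text = (prev_text.text or "") + (text.text or "")
            paragraph.remove(run)
        else:
            previous = run
    return paragraph


if __name__ == "__main__":
    p = merge_runs(ET.fromstring(SAMPLE))
    print([t.text for t in p.iter(f"{{{W}}}t")])
```

After merging, a search for "Hello" can match a single text node instead of 
having to span the "Hel" and "lo " fragments.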

 

It's  also easy to assemble Word documents on the server by using the
xdmp:zip* utilities.
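
For readers who have not looked inside a .docx before, the following Python 
sketch assembles a small Word package with the standard zipfile module. It is 
only an analogy for what you would do on the server with the xdmp:zip* 
functions, not MarkLogic code; the function name `build_docx` is hypothetical, 
and the three parts shown are assumed to be the usual minimum for a package 
that Word will open.

```python
import io
import zipfile
from xml.sax.saxutils import escape

CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""

RELS = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
</Relationships>"""

DOCUMENT = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:body>
</w:document>"""


def build_docx(text: str) -> bytes:
    """Return the bytes of a one-paragraph .docx package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", RELS)
        # Escape the text so characters such as & and < stay well-formed.
        package.writestr("word/document.xml", DOCUMENT.format(text=escape(text)))
    return buffer.getvalue()


if __name__ == "__main__":
    with open("HelloWorld.docx", "wb") as out:
        out.write(build_docx("Hello World"))
```

The same three part names are what you would expect to see under the parts 
directory after the extract pipeline has run on a simple document.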

 

Hope this helps,

Pete

 

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Joydeep_Sinha
Sent: Wednesday, November 12, 2008 10:44 AM
To: general@developer.marklogic.com
Cc: Vivek_Nagasundara; Thangavelu_Senniyappan
Subject: [MarkLogic Dev General] Important: Resolution of an Issue
in Marklogic
Importance: High

 

Hi All,

 

I am from Satyam Computer Services Limited, and we generally build
solutions on top of Marklogic. Currently we are using Marklogic to
upload docx files (MS Office 2007 formats), but we are unaware of
Marklogic's capabilities for converting them to xhtml/xml components.
Please confirm how we can ingest docx formats into Marklogic and
how the latest version of Marklogic supports handling of the latest
Office formats.

It would be great if you could provide us with the exact XQuery for
handling this issue, or tell us what change would be required to
allow ingestion and retrieval of Office 2007 formats to and from
Marklogic.

A quick resolution would be greatly appreciated.

 

Thanks and Regards,

Joydeep Sinha

Onsite Co-ordinator  - IDMF PoC

Media and Entertainment - Solution Offerings

Satyam Computer Services Limited.

Mobile - (001)-6103020388

 

 



DISCLAIMER:
This email (including any attachments) is intended for the sole use of
the intended recipient/s and may contain material that is CONFIDENTIAL
AND PRIVATE COMPANY INFORMATION. Any review or reliance by others or
copying or distribution or forwarding of any or all of the contents in
this message is STRICTLY PROHIBITED. If you are not the intended
recipient, please contact the sender by email and delete all copies;
your cooperation in this regard is appreciated.

___
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general


RE: [MarkLogic Dev General] Creating OOXml structured powerpoint presentations

2008-07-22 Thread Pete Aven
Hi Isha,

 

Yes, you can create PowerPoint 2007 presentations in Mark Logic. 

 

The .pptx package requires different files and a different directory
structure than a Word package (.docx).  You will require some basic
understanding of the package structure and the file content types if you
wish to manipulate a .pptx.  I'd suggest taking a look at the Open XML
Explained .pdf for the basics of PresentationML and DrawingML.  

 

http://openxmldeveloper.org/articles/1970.aspx

 

Hope this helps,

Pete

 

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Isha
Chadha
Sent: Tuesday, July 22, 2008 3:06 AM
To: 'General Mark Logic Developer Discussion'
Subject: [MarkLogic Dev General] Creating OOXml structured
powerpoint presentations

 

 

Hi,

 

Can we create PowerPoint (*.pptx) presentations within Mark Logic? I have
gone through the following link:

 

http://developer.marklogic.com/columns/smallchanges/2007-11-27.xqy

 

I was able to create a Word document as described there. I tried the same for
PowerPoint. It appears that in order to create one, you need to be well
versed in the OOXml structure (PresentationML + DrawingML).

 

Is this the only way?

 

Help in this case is highly appreciated.

 

Regards, 

Isha

 

 

___
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general