I shared an answer in your duplicate SO question <https://stackoverflow.com/questions/48636279/is-there-compress-option-for-exportlistener-in-marklogic-java-client-api/48649442>,
but I'll repeat it here in case it helps anyone.
Your onDocumentReady listener runs once for each document, so it doesn't make
sense to create a new FileOutputStream("F:/Json/file.zip") for each document --
each new stream truncates the file and overwrites whatever was written before.
That's why you're only seeing the last document when you're done. Try moving
these two lines to before you initialize your batcher:
final FileOutputStream dest = new FileOutputStream("F:/Json/file.zip");
final ZipOutputStream out = new ZipOutputStream(new BufferedOutputStream(dest));
That way they'll only run once.
Also, move this line to after dmm.stopJob(batcher);:
out.close();
Also, surround your listener code with a synchronized(out) {...} block so the
threads won't interleave their writes to the shared stream. Remember, your
listener code is going to run in 10 threads in parallel, so the code in your
listener needs to be thread-safe.
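To see why the shared-stream-plus-synchronized pattern works, here's a stand-alone, JDK-only sketch with no MarkLogic calls (ZipDemo and its helper names are just illustrative): several worker threads each add one entry to a single ZipOutputStream that was created once, up front, and is closed once after all workers finish.

```java
import java.io.*;
import java.util.concurrent.*;
import java.util.zip.*;

public class ZipDemo {

    // Build a zip in memory: many worker threads, one shared ZipOutputStream.
    static byte[] buildZip(int docs) throws Exception {
        ByteArrayOutputStream dest = new ByteArrayOutputStream();
        // Created once, before the workers start -- not once per document.
        final ZipOutputStream out = new ZipOutputStream(new BufferedOutputStream(dest));
        ExecutorService pool = Executors.newFixedThreadPool(10);
        for (int i = 0; i < docs; i++) {
            final int n = i;
            pool.submit(() -> {
                try {
                    byte[] data = ("document " + n).getBytes("UTF-8");
                    // Only one thread may touch the zip stream at a time,
                    // so putNextEntry/write/closeEntry never interleave.
                    synchronized (out) {
                        out.putNextEntry(new ZipEntry("doc" + n + ".json"));
                        out.write(data, 0, data.length);
                        out.closeEntry();
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        out.close(); // close once, after all workers are done
        return dest.toByteArray();
    }

    // Count the entries in the finished archive.
    static int countEntries(byte[] zip) throws IOException {
        ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zip));
        int count = 0;
        while (in.getNextEntry() != null) count++;
        return count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countEntries(buildZip(10))); // prints 10
    }
}
```

The same shape applies to your QueryBatcher code: create dest and out before the batcher, wrap the body of onDocumentReady in synchronized(out), and call out.close() after dmm.stopJob(batcher).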
Sam Mefford
Senior Engineer
MarkLogic Corporation
[email protected]
Cell: +1 801 706 9731
www.marklogic.com
________________________________
From: [email protected]
<[email protected]> on behalf of C. Yaswanth
<[email protected]>
Sent: Wednesday, February 7, 2018 7:13 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Extracting ML Documents to a zip file using
Java Client Api ?
Hi,
I am exporting all the documents from a collection to a local directory. Below
is my code.
public class Extract {
    // replace with your MarkLogic Server connection information
    static DatabaseClient client = DatabaseClientFactory.newClient(
        "x", x, "x", "x", Authentication.DIGEST);
    private static String EX_DIR = "F:/JavaExtract";

    // Export documents from the database asynchronously
    public static void exportByQuery() {
        DataMovementManager dmm = client.newDataMovementManager();
        // Construct a collection query with which to drive the job.
        QueryManager qm = client.newQueryManager();
        StringQueryDefinition query = qm.newStringDefinition();
        query.setCollections("GOT");
        // Create and configure the batcher
        QueryBatcher batcher = dmm.newQueryBatcher(query);
        batcher.withBatchSize(10)
            .withThreadCount(1)
            .onUrisReady(
                new ExportListener()
                    .onDocumentReady(doc -> {
                        String[] uriParts = doc.getUri().split("/");
                        try {
                            FileOutputStream dest =
                                new FileOutputStream("F:/Json/file.zip");
                            ZipOutputStream out =
                                new ZipOutputStream(new BufferedOutputStream(dest));
                            ZipEntry e = new ZipEntry(uriParts[uriParts.length - 1]);
                            out.putNextEntry(e);
                            byte[] data = doc.getContent(new StringHandle()).toBuffer();
                            doc.getFormat();
                            out.write(data, 0, data.length);
                            out.closeEntry();
                            out.close();
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }))
            .onQueryFailure(exception -> exception.printStackTrace());
        dmm.startJob(batcher);
        // Wait for the job to complete, and then stop it.
        batcher.awaitCompletion();
        dmm.stopJob(batcher);
    }

    public static void main(String[] args) {
        exportByQuery();
    }
}
When I run it, only the last document in the `GOT` collection ends up in the
zip, rather than all of them.
I can't figure out where I am going wrong.
Any help is appreciated.
Thanks
_______________________________________________
General mailing list
[email protected]
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general