Re: Document Splitter
Cool. I'll shift to that approach. We have a lot of cases where we are indexing a CSV, XML, or JSON file that we want split up.

--
Michael Cizmar
Managing Director
twitter: @michaelcizmar <http://twitter.com/michaelcizmar>
http://www.mcplusa.com/

From: Karl Wright
Sent: Wednesday, July 8, 2020 4:43 PM
To: dev
Subject: Re: Document Splitter

> So you have a choice: either use the component approach, or have each row
> be a full document in its own right. From what I see, the component
> approach would be the best one.
Re: Document Splitter
Hi all,

Julien is correct; all documents must originate in the document repository. You can create document components this way, but they are all subsidiaries of the principal document, so the framework really only tracks the principal document in that case.

So you have a choice: either use the component approach, or have each row be a full document in its own right. From what I see, the component approach would be the best one.

Karl

On Wed, Jul 8, 2020 at 1:25 PM Michael Cizmar wrote:
> Good point, I was thinking that I could do a:
>     return activities.sendDocument(documentURI, docCopy);
> for each row of the XML or JSON.
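To make the component approach concrete, here is a minimal sketch, not ManifoldCF code. It assumes a CSV payload with a header row and no quoted line breaks, and shows how each row could be addressed as a component of one principal document. The class `RowSplitter`, the `Component` type, and the `row-N` identifier scheme are all hypothetical names chosen for illustration; the call a real repository connector would use to hand each component to the framework is deliberately not shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: split a CSV payload into per-row components of one principal document. */
public class RowSplitter {

    /** One row, addressed by the principal document's identifier plus a component identifier. */
    public static final class Component {
        public final String documentId;   // principal document; the only identity the framework would track
        public final String componentId;  // hypothetical per-row identifier, e.g. "row-1"
        public final String content;      // raw text of the row

        Component(String documentId, String componentId, String content) {
            this.documentId = documentId;
            this.componentId = componentId;
            this.content = content;
        }
    }

    /**
     * Naive split: line 0 is treated as the header, every later non-blank line
     * becomes one component. Quoted fields containing line breaks are NOT handled.
     */
    public static List<Component> split(String documentId, String csv) {
        List<Component> components = new ArrayList<>();
        String[] lines = csv.split("\\R"); // \R matches any line break (Java 8+)
        for (int i = 1; i < lines.length; i++) {
            String line = lines[i].trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            components.add(new Component(documentId, "row-" + i, line));
        }
        return components;
    }

    public static void main(String[] args) {
        String csv = "id,name\n1,Ada\n2,Grace\n";
        for (Component c : split("file:/data/people.csv", csv)) {
            System.out.println(c.documentId + "#" + c.componentId + " -> " + c.content);
        }
    }
}
```

Every component carries the same `documentId`, which mirrors the point above: the rows are subsidiaries of the principal document rather than documents the framework tracks on their own. Making each row a full document instead would mean giving each row its own document identifier in the repository connector.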
Re: Document Splitter
Good point, I was thinking that I could do a:

    return activities.sendDocument(documentURI, docCopy);

for each row of the XML or JSON.

From: julien.massi...@francelabs.com
Sent: Wednesday, July 8, 2020 9:45 AM
To: dev@manifoldcf.apache.org
Subject: RE: Document Splitter

> A transformation connector cannot transform 1 incoming document into
> several ones. The only way to do that is in a repository connector but it
> would then be bound to the type of the repo source.
RE: Document Splitter
Hi Michael,

If I am not wrong (and if Karl confirms), what you want to do is not possible in a transformation connector. A transformation connector cannot transform one incoming document into several. The only way to do that is in a repository connector, but it would then be bound to the type of the source repository.

Regards,
Julien

-----Original Message-----
From: Karl Wright
Sent: Wednesday, July 8, 2020 16:16
To: dev
Subject: Re: Document Splitter

> Not that I know of. But I'll let others answer as to what they may have
> written.
Re: Document Splitter
Not that I know of. But I'll let others answer as to what they may have written.

Karl

On Tue, Jul 7, 2020 at 7:38 PM Michael Cizmar wrote:
> I have a JSON file which has an array of objects that I want to index as
> separate documents. Before I build a transformer to split it, is there a
> ready-made transformer to do this?
>
> Thanks!
>
> Michael