[jira] Commented: (JCR-1521) Text Extractors are executed twice
[ https://issues.apache.org/jira/browse/JCR-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12646585#action_12646585 ]

Paco Avila commented on JCR-1521:

I'm interested in correcting this behavior, but I'm not sure how to do it. Are other users interested in this issue and willing to work out a good solution? I think a good way to avoid re-indexing the data is to add a new property that contains a checksum of the jcr:data property.

Text Extractors are executed twice

Key: JCR-1521
URL: https://issues.apache.org/jira/browse/JCR-1521
Project: Jackrabbit
Issue Type: Improvement
Components: indexing, jackrabbit-core
Affects Versions: 1.4
Environment: JDK 1.5, Ubuntu Gutsy
Reporter: Paco Avila
Attachments: DummyMyTextExtractor.java, DummyTextExtractor.java, ExifTextExtractor.java, StackTrace.txt

I have created a test text extractor, and its extractText() method is invoked twice. Is this really necessary, or is it a bug?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
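The checksum idea in the comment above could be sketched roughly as follows. This is only an illustration of the comparison step, not how Jackrabbit works: the class BinaryChecksum and its methods are hypothetical, the choice of SHA-1 is an assumption, and the code deliberately avoids the JCR API. In a real repository the stored checksum would presumably live in a property next to jcr:data.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: decides whether binary content changed since the last extraction.
public class BinaryChecksum {

    /** Returns the SHA-1 digest of the stream content as a lowercase hex string. */
    public static String sha1Hex(InputStream in) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
        byte[] buffer = new byte[8192];
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /**
     * Returns true when no checksum was stored yet, or when the stored
     * checksum differs from the checksum of the current content.
     */
    public static boolean needsExtraction(String storedChecksum, InputStream current) {
        return storedChecksum == null || !storedChecksum.equals(sha1Hex(current));
    }

    public static void main(String[] args) {
        byte[] content = "some binary content".getBytes(StandardCharsets.UTF_8);
        // Assumption: this value would be saved with the node after the first extraction.
        String stored = sha1Hex(new ByteArrayInputStream(content));
        System.out.println("unchanged content needs extraction: "
                + needsExtraction(stored, new ByteArrayInputStream(content)));
    }
}
```

Computing the checksum still means reading the whole binary, so this would only pay off when text extraction is noticeably more expensive than hashing.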
[jira] Commented: (JCR-1521) Text Extractors are executed twice
[ https://issues.apache.org/jira/browse/JCR-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12589049#action_12589049 ]

Jukka Zitting commented on JCR-1521:

> TextExtractor is executed after Node.save(), Node.checkin() and Node.checkout()

It's not a bug, it's a feature! :-) All those operations modify the node, causing text extraction to run. There's been some talk about removing the need to reindex binaries if they haven't changed, but so far there hasn't been any good way to implement that.

Text Extractors are executed twice

Key: JCR-1521
URL: https://issues.apache.org/jira/browse/JCR-1521
Project: Jackrabbit
Issue Type: Bug
Components: jackrabbit-text-extractors
Affects Versions: 1.4
Environment: JDK 1.5, Ubuntu Gutsy
Reporter: Paco Avila
Attachments: DummyMyTextExtractor.java, DummyTextExtractor.java, ExifTextExtractor.java, StackTrace.txt
[jira] Commented: (JCR-1521) Text Extractors are executed twice
[ https://issues.apache.org/jira/browse/JCR-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12589107#action_12589107 ]

Paco Avila commented on JCR-1521:

Ha, ha... good one, but it is very annoying to force text extraction on checkout. The checkout operation does not modify any node property of interest that would justify reindexing the node content. From my point of view, a good way to avoid reindexing the binary would be to add a checksum property that can be used to check whether the binary content has been modified. Alternatively, add an isModified property that is set to true when the user calls ContentNode.setProperty(jcr:data, xxx).

Text Extractors are executed twice

Key: JCR-1521
URL: https://issues.apache.org/jira/browse/JCR-1521
Project: Jackrabbit
Issue Type: Bug
Components: jackrabbit-text-extractors
Affects Versions: 1.4
Environment: JDK 1.5, Ubuntu Gutsy
Reporter: Paco Avila
Attachments: DummyMyTextExtractor.java, DummyTextExtractor.java, ExifTextExtractor.java, StackTrace.txt
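The isModified alternative mentioned in the comment above might behave like the following toy model. Everything here is hypothetical and independent of Jackrabbit: TrackedContent stands in for a content node, setData() for setting jcr:data, and indexIfModified() for the point where the indexer would decide whether to run a text extractor.

```java
// Hypothetical model of a content node that remembers whether its binary changed.
public class TrackedContent {
    private byte[] data = new byte[0];
    private boolean modified;
    private int extractionCount;

    /** Stand-in for setting jcr:data: stores the content and marks it as modified. */
    public void setData(byte[] newData) {
        this.data = newData.clone();
        this.modified = true;
    }

    /** Stand-in for a versioning operation: it leaves the binary and the flag untouched. */
    public void checkout() {
        // intentionally empty
    }

    /**
     * Runs the (simulated) extraction only when the binary was modified
     * since the last run, then clears the flag. Returns true if it ran.
     */
    public boolean indexIfModified() {
        if (!modified) {
            return false;
        }
        extractionCount++;
        modified = false;
        return true;
    }

    public int getExtractionCount() {
        return extractionCount;
    }

    public static void main(String[] args) {
        TrackedContent node = new TrackedContent();
        node.setData(new byte[] {1, 2, 3});
        node.indexIfModified(); // extraction runs once for the new content
        node.checkout();
        node.indexIfModified(); // skipped, the binary did not change
        System.out.println("extractions: " + node.getExtractionCount());
    }
}
```

Compared with a checksum, a flag avoids re-reading the binary, but it relies on every code path that changes the content also setting the flag.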
[jira] Commented: (JCR-1521) Text Extractors are executed twice
[ https://issues.apache.org/jira/browse/JCR-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12587524#action_12587524 ]

Jukka Zitting commented on JCR-1521:

A wild guess: are you using WebDAV to import files into the repository?

Text Extractors are executed twice

Key: JCR-1521
URL: https://issues.apache.org/jira/browse/JCR-1521
Project: Jackrabbit
Issue Type: Bug
Components: jackrabbit-text-extractors
Affects Versions: 1.4
Environment: JDK 1.5, Ubuntu Gutsy
Reporter: Paco Avila
Attachments: ExifTextExtractor.java
[jira] Commented: (JCR-1521) Text Extractors are executed twice
[ https://issues.apache.org/jira/browse/JCR-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12587538#action_12587538 ]

Paco Avila commented on JCR-1521:

No, this is done through the JCR API.

Text Extractors are executed twice

Key: JCR-1521
URL: https://issues.apache.org/jira/browse/JCR-1521
Project: Jackrabbit
Issue Type: Bug
Components: jackrabbit-text-extractors
Affects Versions: 1.4
Environment: JDK 1.5, Ubuntu Gutsy
Reporter: Paco Avila
Attachments: ExifTextExtractor.java
[jira] Commented: (JCR-1521) Text Extractors are executed twice
[ https://issues.apache.org/jira/browse/JCR-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12587194#action_12587194 ]

Marcel Reutegger commented on JCR-1521:

I'm not able to reproduce this issue. Can you please be more specific? Which methods do you call? Do you have a test case?

Text Extractors are executed twice

Key: JCR-1521
URL: https://issues.apache.org/jira/browse/JCR-1521
Project: Jackrabbit
Issue Type: Bug
Components: jackrabbit-text-extractors
Affects Versions: 1.4
Environment: JDK 1.5, Ubuntu Gutsy
Reporter: Paco Avila
[jira] Commented: (JCR-1521) Text Extractors are executed twice
[ https://issues.apache.org/jira/browse/JCR-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12587226#action_12587226 ]

Paco Avila commented on JCR-1521:

I have uploaded my ExifTextExtractor. This extractor is executed twice each time I put a JPG file into the repository. Used with:

* jackrabbit-api-1.4.jar
* jackrabbit-core-1.4.2.jar
* jackrabbit-jcr2spi-1.4.jar
* jackrabbit-jcr-commons-1.4.2.jar
* jackrabbit-spi-1.4.jar
* jackrabbit-spi-commons-1.4.jar
* jackrabbit-text-extractors-1.4.jar

Text Extractors are executed twice

Key: JCR-1521
URL: https://issues.apache.org/jira/browse/JCR-1521
Project: Jackrabbit
Issue Type: Bug
Components: jackrabbit-text-extractors
Affects Versions: 1.4
Environment: JDK 1.5, Ubuntu Gutsy
Reporter: Paco Avila
Attachments: ExifTextExtractor.java
[jira] Commented: (JCR-1521) Text Extractors are executed twice
[ https://issues.apache.org/jira/browse/JCR-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12587462#action_12587462 ]

Dave Brosius commented on JCR-1521:

It might be useful to put this

    Exception e = new Exception();
    e.fillInStackTrace();
    e.printStackTrace();

in your extractText method, to show folks where the two calls are coming from.

Text Extractors are executed twice

Key: JCR-1521
URL: https://issues.apache.org/jira/browse/JCR-1521
Project: Jackrabbit
Issue Type: Bug
Components: jackrabbit-text-extractors
Affects Versions: 1.4
Environment: JDK 1.5, Ubuntu Gutsy
Reporter: Paco Avila
Attachments: ExifTextExtractor.java
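A self-contained variant of the stack-trace suggestion above might look like this. The class CallerTrace and its extractText() method are hypothetical stand-ins for a real text extractor. The sketch only shows how to capture the call stack as text, so that two invocations can be compared to see which code path triggered each one.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical diagnostic helper: records who called a method.
public class CallerTrace {

    /** Captures the current call stack as text, labelled with the given message. */
    public static String capture(String label) {
        StringWriter out = new StringWriter();
        // The Exception constructor already fills in the stack trace.
        new Exception(label).printStackTrace(new PrintWriter(out, true));
        return out.toString();
    }

    /** Stand-in for an extractor's extractText(): logs its caller and returns the trace. */
    public static String extractText() {
        String trace = capture("extractText() invoked");
        System.err.print(trace);
        return trace;
    }

    public static void main(String[] args) {
        extractText();
    }
}
```

Each invocation prints one trace to standard error; the frames below extractText() in each trace identify the caller.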