Re: [xwiki-devs] Office Importer - Importing into multiple wiki pages

Ludovic Dubost Tue, 10 Mar 2009 08:13:30 -0700

Sergiu Dumitriu wrote:
> Asiri Rathnayake wrote:
>   
>> Hi devs,
>>
>> To implement the above functionality I have created the following UI:
>> http://i43.tinypic.com/28l7x2u.png which was derived from the mockups
>> located at
>> http://incubator.myxwiki.org/xwiki/bin/view/Mockups/ImportCompositeDocument
>>     
>
> I don't like this mockup. This should be part of our standard Office 
> import feature, and as such it should provide the simplest interface 
> possible for basic users, without cluttering the UI with regexps and 
> "Leave empty if..." messages. Instead, this should be split into 2 
> dialogs, one with the file input field,  and checkboxes for "Clean the 
> document styles to better match the content of the wiki" and "Split the 
> document into several wiki pages". If the second checkbox is selected, 
> then when pressing Next the second dialog appears, and the user can 
> select the rest of the information.
>


We can make alternative proposals. We just need to propose them.
We plan to have somebody from the usability team spend some time on this.

Ludovic
>   
>> Descriptions of various fields are as follows:
>>
>> * Document - The office document to be uploaded (and imported)
>>
>> * Style filtering - Whether to filter office styles or not
>>
>> * Heading level to split - If the user wishes to split the imported document
>> into multiple wiki pages, he has to select the heading level (h1, h2, h3...
>> h6) to be used when splitting the document. If the user does not select a
>> heading level, the document will be imported as it is (no splitting).
>>
>> * Custom split regex - If the user wants to further refine the split
>> criterion (based on the content of header) this field allows him to specify
>> that criterion through a regular expression.
>>
>>     Example regular expression: <b>Section<b>.*
>>
>>     Open Question: Aren't regular expressions a bit too technical for users?
>>
>> * Target space - This is where the resulting document(s) will land.
>>
>> * Target (master) page - The main document holding the TOC (in case of
>> splitting), otherwise this is the name of resulting wiki page.
>>
>> * Child pages naming method - If the document is split into multiple pages,
>> pages should be named according to some criterion. This combo box allows
>> users to specify that criterion.
>>
>>
>>
>> Regarding the implementation, we have two possible approaches.
>>
>> 1. Implement the splitting at the W3C DOM level (XHTML)
>> 2. Implement the splitting at the XDOM level
>>
>> * In the first approach we will navigate through the child elements directly
>> under the <body> tag and find matching heading elements. For the regex, we will
>> have to serialize the heading element so that the regex can be evaluated.
>> Heading elements can be serialized as explained here:
>> http://forums.sun.com/thread.jspa?threadID=698475
>>
>> * In the second approach we can either use XDOM operations or use a
>> SplittingChainingListener. But I don't know whether regex matching is
>> possible with this scheme.
>>
>> Also, regardless of the method we follow, there will be a problem with large
>> office documents (say 100MB or so). Loading such a file into memory (DOM or
>> XDOM) would not be a good idea.
>>
>> I haven't decided which method to go with yet. So it will be really great if
>> we can sort this out as soon as possible.
>>
>>     
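To make the first approach (splitting at the W3C DOM level) concrete, here is a minimal sketch. It is not taken from the Office Importer code: the class DomSplitSketch and its methods are hypothetical names, and it matches the optional regex against the heading's text content instead of its serialized markup, which is a simplification of what Asiri describes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration only; not part of XWiki.
public class DomSplitSketch {

    /** One future wiki page: its heading text (null before the first match) and its nodes. */
    public static class Section {
        public final String title;
        public final List<Node> nodes = new ArrayList<>();

        Section(String title) {
            this.title = title;
        }
    }

    /**
     * Walks the direct children of body and starts a new section at every
     * heading of the given level whose text matches titleFilter.
     * A null titleFilter means every heading of that level is a split point.
     */
    public static List<Section> split(Document doc, int level, Pattern titleFilter) {
        Element body = (Element) doc.getElementsByTagName("body").item(0);
        String headingTag = "h" + level;
        List<Section> sections = new ArrayList<>();
        Section current = new Section(null); // content before the first split point
        sections.add(current);
        for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
            boolean isSplitPoint = n.getNodeType() == Node.ELEMENT_NODE
                && headingTag.equalsIgnoreCase(n.getNodeName())
                && (titleFilter == null
                    || titleFilter.matcher(n.getTextContent().trim()).matches());
            if (isSplitPoint) {
                current = new Section(n.getTextContent().trim());
                sections.add(current);
            }
            current.nodes.add(n);
        }
        return sections;
    }

    /** Parses well-formed XHTML held in a string; the whole tree is loaded in memory. */
    public static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XHTML", e);
        }
    }
}
```

With a filter such as Pattern.compile("Section.*"), headings that do not match stay inside the preceding section. The memory concern raised above still applies here, since the full document tree is built before splitting.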
>
> 3. At the SAX level, since SAX doesn't load the whole document in 
> memory. This is the best option if we want to consider large documents 
> and memory consumption. This, however, will make the splitter harder to 
> integrate in the current rendering engine.
>
>   
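For comparison, the SAX option Sergiu mentions could look roughly like the sketch below. It is only an illustration under assumptions: SaxSplitSketch is a hypothetical class, attributes are dropped, text is not re-escaped, and finished sections are collected in a list for simplicity. A real splitter would hand each finished section to storage at that point so that only one section is held in memory at a time.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not part of XWiki.
public class SaxSplitSketch extends DefaultHandler {
    private final String headingTag;
    private final List<String> sections = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();
    private boolean inBody;

    public SaxSplitSketch(int level) {
        this.headingTag = "h" + level;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("body".equalsIgnoreCase(qName)) {
            inBody = true;
            return;
        }
        if (!inBody) {
            return;
        }
        if (headingTag.equalsIgnoreCase(qName)) {
            flush(); // a heading of the split level closes the previous section
        }
        current.append('<').append(qName).append('>'); // attributes dropped for brevity
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("body".equalsIgnoreCase(qName)) {
            flush();
            inBody = false;
        } else if (inBody) {
            current.append("</").append(qName).append('>');
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inBody) {
            current.append(ch, start, length); // not re-escaped in this sketch
        }
    }

    // A streaming implementation would save the page here instead of keeping it.
    private void flush() {
        if (current.length() > 0) {
            sections.add(current.toString());
            current.setLength(0);
        }
    }

    /** Splits well-formed XHTML at every heading of the given level. */
    public static List<String> split(String xhtml, int level) {
        SaxSplitSketch handler = new SaxSplitSketch(level);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xhtml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XHTML", e);
        }
        return handler.sections;
    }
}
```

The handler never builds a tree, which is the memory advantage of this option. The price is that the output is a stream of events or text and not DOM or XDOM nodes, which is where the integration difficulty mentioned above comes from.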


-- 
Ludovic Dubost
Blog: http://blog.ludovic.org/
XWiki: http://www.xwiki.com
Skype: ldubost GTalk: ldubost

