Re: [libreoffice-users] Specialty Dictionaries
Hi :)

It sounds like interesting and useful functionality that might well be worth adding if it hasn't been done already. I think a lot of us here have been focusing on workarounds just to get the job done, but I think it might be a good idea to post a bug report and mark it as a feature request.

Regards from Tom :)

--- On Fri, 6/7/12, Simon Cropper simoncrop...@fossworkflowguides.com wrote:
From: Simon Cropper simoncrop...@fossworkflowguides.com
Subject: Re: [libreoffice-users] Specialty Dictionaries
To: users@global.libreoffice.org
Date: Friday, 6 July, 2012, 1:19

Ideally you need a blank concatenation character that is recognized by LO as linking two words (such as the non-breaking space already available in LO; it does not necessarily have to physically bind the words together, but it would need to be seen by the spell checker as a joining character) AND IS RECOGNIZED by the spell checker, substituted with something like an underscore, and compared to the lists in the dic files, where the entry would appear as Eucalyptus_vulgaris.

I just need someone in the know to be able to insert this functionality and these problems would be solved.

--
Cheers Simon

--
For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted
Re: [libreoffice-users] Specialty Dictionaries
I wonder how hard it would be, or how confusing it would be, to make a two-word reference in the word list file. If the space appears blank, then what would happen if there was some other blank character that looked like the space character? I think the originally defined system came from some other organization, like Hunspell or MySpell. It may take a lot of work, and all of the current .oxt dictionaries might need to be redone if there was a new formatting system that allowed a two-word listing.

My question is why would one be needed? Is the only reason for it that one combination of terms is valid, while a different ending term is not? That would make for some real context-oriented spelling, which is worse than figuring out grammar issues. You would need to have defined all the possible two-word term combinations instead of having each single word/term checked for proper spelling.

The more I think about it, the more I want to cringe over what might be needed to get it working fully without messing with defined characters for non-English fonts, or causing other issues by taking over a pre-defined character for some internal function when that character may be mapped to a glyph/letter in some font used by some user. Then that user[s] would have their document messed up in some way.

Still, if someone can figure out all the issues and work out a way that will not cause a set of users problems down the line, then by all means create a request for a modification of the spell checker system. I really doubt that it will be changed from the current system, but you can try.

On 07/07/2012 06:22 PM, Tom Davies wrote:

Hi :)

It sounds like interesting and useful functionality that might well be worth adding if it hasn't been done already. I think a lot of us here have been focusing on workarounds just to get the job done, but I think it might be a good idea to post a bug report and mark it as a feature request.

Regards from Tom :)
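The two-word check discussed in this message could be approximated outside the spell checker. The following is only a rough sketch, not anything LibreOffice or Hunspell provides: the function name `find_unknown_binomials` and the sample species list are made up for illustration, and it assumes binomials are written as a capitalised genus followed by a lower-case epithet. It flags pairs where both words are individually known but the combination is not in the list, which is the case a single-word dictionary cannot catch.

```python
import re

def find_unknown_binomials(text, binomials):
    """Return genus/epithet pairs whose words are each known but whose combination is not.

    `binomials` is an iterable of known two-word names (illustrative data only).
    """
    known = set(binomials)
    genera = {name.split()[0] for name in known}
    epithets = {name.split()[1] for name in known}
    suspects = []
    # Assumption: a binomial looks like "Genus epithet" (capitalised word, then lower-case word).
    for match in re.finditer(r"\b([A-Z][a-z]+)\s+([a-z]+)\b", text):
        genus, epithet = match.groups()
        pair = f"{genus} {epithet}"
        if genus in genera and epithet in epithets and pair not in known:
            suspects.append(pair)
    return suspects

if __name__ == "__main__":
    known_names = ["Eucalyptus vulgaris", "Acacia dealbata"]  # sample list, not real data
    report = "We recorded Eucalyptus vulgaris and Acacia vulgaris near the Eucalyptus stand."
    print(find_unknown_binomials(report, known_names))
```

A misspelled genus or epithet is not reported here; that is left to the ordinary single-word spell check, so the two checks are meant to complement each other.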
Re: [libreoffice-users] Specialty Dictionaries
Hi :)

I think I would create a new specialist list and add 2 or 3 words to it. Then look for the file to see what format it uses, and then copy-paste tons of words in at a time. For combined words I would add a - in the middle, e.g. Eucalyptus-vulgaris, but I think that is a bit of a kludge. There are probably much more elegant ways, which others will probably go into, and they may have good ideas about combined words too.

Btw, I tend to use ' for sarcastic or cynical statements and for quotes. So it 'should' work = it probably won't work but 'experts' say it will.

Regards from Tom :)

--- On Thu, 5/7/12, Simon Cropper simoncrop...@fossworkflowguides.com wrote:
From: Simon Cropper simoncrop...@fossworkflowguides.com
Subject: [libreoffice-users] Specialty Dictionaries
To: users@global.libreoffice.org
Date: Thursday, 5 July, 2012, 7:14

Hi All,

I saw over the last month discussions regarding special dictionaries. What became of this?

How easy is it to create special dictionaries? Are there any resources regarding their construction? I know you can import 1 by 1 but I have 20,000 items to add.

Also, are composite words addressed in these dictionaries? I have need for a dictionary that searches for and matches binomials. Say, fictitiously, I have a plant called 'Eucalyptus vulgaris'; I want the dictionary to see the binomial, not 'Eucalyptus' or 'vulgaris' separately. Both these individual words are quite common but the combo is unique (i.e. there are multiple eucalypts and multiple species with vulgaris as a species epithet).

--
Cheers Simon

Simon Cropper - Open Content Creator
Free and Open Source Software Workflow Guides
Introduction http://www.fossworkflowguides.com
GIS Packages http://www.fossworkflowguides.com/gis
bash / Python http://www.fossworkflowguides.com/scripting
Re: [libreoffice-users] Specialty Dictionaries
Tom,

Yeah, I have thought of both these things. I have hacked a standard file before, particularly in MS Word. It is easily done, assuming it is a text file and not a binary file.

The problem is the binomial. I also thought of the concatenation string, but most of the single characters already have special meanings in various word processors. Hyphens, for example, are used in LO as hyphens, so when you remove the joining character once your report is complete, how would you know what is a concatenation character and what is a real hyphen? In other situations I have used =!= as a joining string but, as stated, it is messy and hard to read.

On 05/07/12 17:21, Tom Davies wrote:

Hi :)

I think I would create a new specialist list and add 2 or 3 words to it. Then look for the file to see what format it uses, and then copy-paste tons of words in at a time. For combined words I would add a - in the middle, e.g. Eucalyptus-vulgaris, but I think that is a bit of a kludge. There are probably much more elegant ways, which others will probably go into, and they may have good ideas about combined words too.

Btw, I tend to use ' for sarcastic or cynical statements and for quotes. So it 'should' work = it probably won't work but 'experts' say it will.

Regards from Tom :)

--
Cheers Simon
Re: [libreoffice-users] Specialty Dictionaries
With the word list system for LO's dictionaries, I have found no info regarding how to use a two-word combo. All I have been able to do is have a word list in a .dic file within an .oxt file. Having a term with two parts, like your example, is something I have not found out how to do.

As for creating a 20,000 word specialized dictionary, you can get with me off the list and I can help you create it.

On 07/05/2012 06:06 AM, Simon Cropper wrote:

Tom,

Yeah, I have thought of both these things. I have hacked a standard file before, particularly in MS Word. It is easily done, assuming it is a text file and not a binary file.

The problem is the binomial. I also thought of the concatenation string, but most of the single characters already have special meanings in various word processors. Hyphens, for example, are used in LO as hyphens, so when you remove the joining character once your report is complete, how would you know what is a concatenation character and what is a real hyphen? In other situations I have used =!= as a joining string but, as stated, it is messy and hard to read.
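For the bulk-import part of the question, a word list can be generated with a short script instead of adding entries one by one. This is a minimal sketch, assuming the .dic file follows the Hunspell word-list layout (an entry count on the first line, then one entry per line). The function names and the sample terms are hypothetical, and packaging the result into an .oxt (which also needs an .aff file and descriptor files) is not shown. Because two-word entries are not handled, as noted in this message, each binomial is split into its single words.

```python
from pathlib import Path

def build_dic_lines(terms):
    """Return Hunspell-style .dic lines: a count line, then one unique word per line."""
    words = set()
    for term in terms:
        # Two-word entries are split, since the word list holds single words only.
        words.update(term.split())
    ordered = sorted(words)
    return [str(len(ordered))] + ordered

def write_dic(terms, path):
    """Write the word list to `path` as UTF-8 text, one line per entry."""
    Path(path).write_text("\n".join(build_dic_lines(terms)) + "\n", encoding="utf-8")

if __name__ == "__main__":
    # Sample terms for illustration only; a real list would be read from a file.
    sample = ["Eucalyptus vulgaris", "Eucalyptus globulus", "Acacia vulgaris"]
    for line in build_dic_lines(sample):
        print(line)
```

With 20,000 binomials the same call applies unchanged; the count line simply reflects the number of distinct single words after splitting.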
Re: [libreoffice-users] Specialty Dictionaries
On Thu, Jul 5, 2012 at 3:06 AM, Simon Cropper simoncrop...@fossworkflowguides.com wrote:

Yeah, I have thought of both these things. I have hacked a standard file before, particularly in MS Word. It is easily done, assuming it is a text file and not a binary file.

The problem is the binomial. I also thought of the concatenation string, but most of the single characters already have special meanings in various word processors. Hyphens, for example, are used in LO as hyphens, so when you remove the joining character once your report is complete, how would you know what is a concatenation character and what is a real hyphen? In other situations I have used =!= as a joining string but, as stated, it is messy and hard to read.

An underscore works well.
Re: [libreoffice-users] Specialty Dictionaries
On 06/07/12 05:26, nvrk wrote:

An underscore works well.

Yeap, but an underscore gets used a lot in technical reports (e.g. in a URL). If you do a global search and replace to remove the character at the end of writing, so the report looks clean and well presented, you neuter the URL or corrupt any other text string that uses it.

As an alternative to free-form typing of jargon or technical terms and then running a spell checker, terms could be inserted from a list. This works OK, but in the absence of LO integration you can't flag the inserted text as 'hey this is jargon, I just inserted it from a secure source, don't bother spell checking'. This can be done, but it requires you to manually apply language characteristics to hundreds or thousands of names, or alternatively hit ignore the same number of times with the spell checker. :(

Ideally you need a blank concatenation character that is recognized by LO as linking two words (such as the non-breaking space already available in LO; it does not necessarily have to physically bind the words together, but it would need to be seen by the spell checker as a joining character) AND IS RECOGNIZED by the spell checker, substituted with something like an underscore, and compared to the lists in the dic files, where the entry would appear as Eucalyptus_vulgaris.

I just need someone in the know to be able to insert this functionality and these problems would be solved.

--
Cheers Simon
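The search-and-replace problem described in this message (stripping the joining underscore also damages URLs) can be narrowed by replacing the joiner only inside terms taken from the known list. This is a rough sketch under stated assumptions, not a LibreOffice feature: the function name `unjoin_known_terms`, the example URLs and the sample term are all illustrative, and the text is assumed to have been exported to plain text first.

```python
import re

def unjoin_known_terms(text, binomials, joiner="_"):
    """Replace the joiner with a space only inside terms from the known list."""
    joined = {name.replace(" ", joiner): name for name in binomials}
    if not joined:
        return text
    # Longest alternatives first, so a longer term wins over a shorter prefix.
    alternatives = sorted(joined, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![\w/.-])("                                  # not preceded by URL/path characters
        + "|".join(re.escape(term) for term in alternatives)
        + r")(?![\w/-])"                                  # not followed by more of a word or path
    )
    return pattern.sub(lambda m: joined[m.group(1)], text)

if __name__ == "__main__":
    draft = (
        "Eucalyptus_vulgaris grows here; see http://example.org/my_file.html "
        "and http://example.org/Eucalyptus_vulgaris."
    )
    print(unjoin_known_terms(draft, ["Eucalyptus vulgaris"]))
```

Underscores in URLs are left alone in two ways: strings that are not in the list are never touched, and a listed term that directly follows a `/` is skipped. A listed term appearing elsewhere inside a longer identifier would still need a manual check.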