[tw] Re: Slices with names in non-latin letters?
OK, I've done some simple tests. Adding

абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ

(with a space at the end) to each of the ([\.\w]+) parts got this working:

Tiddler: [[Сегменты с русскими именами: тесты]] (the title means "Slices with Russian names: tests")

|Slicename|slice content|
|Slice name|slice content 2|
|Имясегмента|содержимое сегмента 3|
|Имя сегмента|содержимое сегмента 4|

{{{<<tiddler [[Сегменты с русскими именами: тесты::Slicename]]>>}}}
<<tiddler [[Сегменты с русскими именами: тесты::Slicename]]>>
{{{<<tiddler [[Сегменты с русскими именами: тесты::Slice name]]>>}}}
<<tiddler [[Сегменты с русскими именами: тесты::Slice name]]>>
{{{<<tiddler [[Сегменты с русскими именами: тесты::Имясегмента]]>>}}}
<<tiddler [[Сегменты с русскими именами: тесты::Имясегмента]]>>
{{{<<tiddler [[Сегменты с русскими именами: тесты::Имя сегмента]]>>}}}
<<tiddler [[Сегменты с русскими именами: тесты::Имя сегмента]]>>

(each of the four tiddler macros shows the slice content).

But the thing is, I only got this working by changing the core. First, I wrote a plugin:

TiddlyWiki.prototype.slicesRE = /(?:^([\'\/]{0,2})~?([\.\wабвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ ]+)\:\1[\t\x20]*([^\n]*)[\t\x20]*$)|(?:^\|([\'\/]{0,2})~?([\.\wабвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ ]+)\:?\4\|[\t\x20]*([^\|\n]*)[\t\x20]*\|$)/gm;

which didn't work. I guess that's because slicesRE gets redefined after the slices hashmap has already been built. Is anybody aware of a fast way to rebuild the slices? Of course, I can copy the store, then purge the main one, then copy the tiddlers back into the main store, but that is bulky for something that has to run at every startup.

On the other hand, I'm going to analyse the syntax, do some more tests and then discuss this as a core update, so perhaps the first question is not that important.
On 12 Feb, 16:31, PMario pmari...@gmail.com wrote:
> On Feb 12, 1:20 pm, Yakov yakov.litvin.publi...@gmail.com wrote:
>> So, what's different with slices? I looked at TiddlyWiki.js and it seems that slices are used the same way, although they may be used differently in other .js files.
> ahhh, :) imo nothing. But there may be some chars that are not allowed within an object element. To handle this, an escape mechanism would have to be found.
> I think you should extend your formatter with a plugin and run several tests to find out if it works ;)
> -m

--
You received this message because you are subscribed to the Google Groups TiddlyWiki group.
To post to this group, send email to tiddlywiki@googlegroups.com.
To unsubscribe from this group, send email to tiddlywiki+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/tiddlywiki?hl=en.
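On the rebuild question above: if a store parses slices lazily and keeps the result in a per-tiddler cache, then swapping the regexp and emptying that cache may be enough, with no need to copy the whole store. Below is a minimal, self-contained sketch of that idea. MiniStore, its slices cache, calcAllSlices and getSlice are hypothetical stand-ins written for this illustration, not TiddlyWiki core code; whether the real core caches slices the same way is an assumption that would need checking against TiddlyWiki.js. The \u0400-\u04FF range (the Cyrillic block) is also a shortcut chosen here instead of the explicit alphabet.

```javascript
// Hypothetical stand-in for a store with a lazily filled slice cache.
// None of this is TiddlyWiki core code; the names are for illustration only.
function MiniStore(texts) {
  this.texts = texts;   // map: tiddler title -> tiddler text
  this.slices = {};     // map: tiddler title -> (slice name -> slice value)
}

// Simplified |name|value| pattern; the name is limited to [\.\w] as in the thread.
MiniStore.prototype.slicesRE = /^\|([\.\w]+)\|[\t\x20]*([^\|\n]*)[\t\x20]*\|$/gm;

MiniStore.prototype.calcAllSlices = function (title) {
  var result = {};
  var text = this.texts[title] || "";
  this.slicesRE.lastIndex = 0;          // a /g regexp remembers its position
  var m;
  while ((m = this.slicesRE.exec(text)) !== null) {
    result[m[1]] = m[2];
  }
  return result;
};

MiniStore.prototype.getSlice = function (title, name) {
  if (!this.slices[title]) {
    this.slices[title] = this.calcAllSlices(title);   // parse once, then reuse
  }
  return this.slices[title][name];
};

var store = new MiniStore({ Test: "|Name|Ivan|\n|Имя|Иван|" });
var before = store.getSlice("Test", "Имя");   // cache is built with the old regexp

// "Plugin" step: widen the name class (assumed range, see the note above).
MiniStore.prototype.slicesRE = /^\|([\.\w\u0400-\u04FF ]+)\|[\t\x20]*([^\|\n]*)[\t\x20]*\|$/gm;
var stale = store.getSlice("Test", "Имя");    // still answered from the old cache

store.slices = {};                            // drop the stale cache
var after = store.getSlice("Test", "Имя");    // re-parsed with the new regexp

console.log(before, stale, after);
```

In this toy model the first two lookups come back undefined and only the lookup after the cache reset finds the Cyrillic slice, which matches the symptom described above (a regexp replaced after the cache was filled has no visible effect).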
[tw] Re: Slices with names in non-latin letters?
On 11 Feb, 17:26, PMario pmari...@gmail.com wrote:
> On Feb 10, 10:31 pm, Yakov yakov.litvin.publi...@gmail.com wrote:
>> ... (not sure what the hashmap term means here) So at first glance it seems that it's possible to have slices with any symbols.. Let me know if I miss something.
> In this case it's just a lookup table [2] to have fast access to a tiddler, based on its title.
> see: https://github.com/TiddlyWiki/tiddlywiki/blob/master/js/TiddlyWiki.js...

So, what's different with slices? I looked at TiddlyWiki.js and it seems that slices are used the same way, although they may be used differently in other .js files.
[tw] Re: Slices with names in non-latin letters?
On Feb 12, 1:20 pm, Yakov yakov.litvin.publi...@gmail.com wrote:
> So, what's different with slices? I looked at TiddlyWiki.js and it seems that slices are used the same way, although they may be used differently in other .js files.

ahhh, :) imo nothing. But there may be some chars that are not allowed within an object element. To handle this, an escape mechanism would have to be found.

I think you should extend your formatter with a plugin and run several tests to find out if it works ;)

-m
[tw] Re: Slices with names in non-latin letters?
On Feb 10, 10:31 pm, Yakov yakov.litvin.publi...@gmail.com wrote:
> ... (not sure what the hashmap term means here) So at first glance it seems that it's possible to have slices with any symbols.. Let me know if I miss something.

In this case it's just a lookup table [2] that gives fast access to a tiddler, based on its title.
see: https://github.com/TiddlyWiki/tiddlywiki/blob/master/js/TiddlyWiki.js#L34

[2] http://en.wikipedia.org/wiki/Hashmap

-m
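To make the "lookup table" point concrete, here is a small sketch of a plain JavaScript object used as a hashmap keyed by title. The tiddlers object and the addTiddler / fetchTiddler helpers are written for this illustration and only mimic the idea; they are not copied from the core. The keys deliberately include Cyrillic letters, spaces, a colon and a hyphen. Object.create(null) is a choice made here so that inherited names such as "constructor" cannot be mistaken for entries.

```javascript
// Illustration only: a plain object used as a title -> record lookup table.
var tiddlers = Object.create(null);   // no inherited keys such as "constructor"

function addTiddler(title, text) {
  tiddlers[title] = { title: title, text: text };
}

function fetchTiddler(title) {
  // Bracket notation accepts any string as a key.
  return tiddlers[title];             // undefined when the title is unknown
}

addTiddler("Сегменты с русскими именами: тесты", "|Имя|Иван|");
addTiddler("long-term plans", "nothing yet");

console.log(fetchTiddler("Сегменты с русскими именами: тесты").text);
console.log(fetchTiddler("long-term plans").text);
console.log(fetchTiddler("constructor"));
```

Keys like these are only reachable through brackets; dot notation (tiddlers.long-term plans) would not parse, which may be where the worry about spaces and minus signs comes from.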
[tw] Re: Slices with names in non-latin letters?
On 7 Feb, 01:23, PMario pmari...@gmail.com wrote:
> On Feb 6, 5:27 pm, Yakov yakov.litvin.publi...@gmail.com wrote:
>> ... In fact, it seems that if someone remembers what the limitations are here, [\.\w] can be substituted with [^something] where something is those symbols that shouldn't be in the slice name. In this case, this can go to the core.. (if the RegExp is the *only* problem here).
> If this goes into the core, I'd expect someone will have to do ultra-heavy testing first ;)
> -m

I'll definitely do some tests when I get enough time. The main thought in my head about this is: tiddlers have titles with almost any symbols (as in [1], the title is just a line!), and they are stored as properties of the tiddlers object [2]! (not sure what the hashmap term means here) So at first glance it seems that it's possible to have slices with any symbols.. Let me know if I miss something.

[1] https://github.com/TiddlyWiki/tiddlywiki/blob/master/js/TiddlyWiki.js#L19
[2] https://github.com/TiddlyWiki/tiddlywiki/blob/master/js/TiddlyWiki.js#L7
[tw] Re: Slices with names in non-latin letters?
Is it only the RegExp that causes the problem? If so, I think I can make a small patch which would substitute the current RegExp with one containing [\.\wа-яё] instead of [\.\w] (the а-яё part is the Russian alphabet). The other question is -- is it safe to add \- and \s to this part (as they can be parts of data names, like in "long-term plans")?

In fact, it seems that if someone remembers what the limitations are here, [\.\w] can be substituted with [^something], where "something" is those symbols that shouldn't be in the slice name. In this case, this can go to the core.. (if the RegExp is the *only* problem here).

On 5 Feb, 21:56, Eric Shulman elsdes...@gmail.com wrote:
> The TWCore uses a regexp text pattern to parse slice definitions embedded in tiddler content. ... the slice *name* pattern is: [\.\w]+ ...
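The three candidate name classes discussed above can be compared directly. The sketch below tests a few sample names against [\.\w], [\.\wа-яё], and one possible negated class. The particular exclusion set (|, : and newline) is only a guess made for this example, not a vetted proposal, and the wholeName helper is likewise made up for the illustration.

```javascript
// Build a regexp that must match the whole candidate name.
function wholeName(charClass) {
  return new RegExp("^" + charClass + "+$");
}

var candidates = {
  core:     wholeName("[\\.\\w]"),       // current name class
  cyrillic: wholeName("[\\.\\wа-яё]"),   // proposed extension (lowercase letters only)
  negated:  wholeName("[^\\|:\\n]")      // assumed exclusion set, see the note above
};

var names = ["Name", "имя", "Имя", "long-term plans"];

names.forEach(function (name) {
  var row = Object.keys(candidates).map(function (key) {
    return key + "=" + candidates[key].test(name);
  });
  console.log(JSON.stringify(name) + ": " + row.join(", "));
});
```

One thing this makes visible: the а-яё range covers lowercase letters only, so a capitalised name such as Имя still fails unless А-ЯЁ is added to the class as well.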
[tw] Re: Slices with names in non-latin letters?
On Feb 6, 5:27 pm, Yakov yakov.litvin.publi...@gmail.com wrote:
> [\.\wа-яё]

If you have a look at the slices handling, you'll see that they end up as something similar to the following:

    var x = {}; x['Имя'] = 'Иван'; console.log(x);

So as long as the browser's JavaScript can access Russian object elements, it should be possible.

    var x = {}; x['абвгдеёжзийклмнопрстуфхцчшщъыьэюяабвгдеёжзийклмнопрстуфхцчшщъыьэюя'] = 'абвгдеёжзийклмнопрстуфхцчшщъыьэюяабвгдеёжзийклмнопрстуфхцчшщъыьэюя'; console.log(x);

Both tests seem to work. So the browser may digest it. I'm not so sure about TW :)

> (the а-яё part is the Russian alphabet). The other question is -- is it safe to add \- and \s to this part (as they can be parts of data names, like in "long-term plans")?

If you add spaces and minus to object elements, I bet you'll have big troubles.

> ... In fact, it seems that if someone remembers what the limitations are here, [\.\w] can be substituted with [^something] where something is those symbols that shouldn't be in the slice name. In this case, this can go to the core.. (if the RegExp is the *only* problem here).

If this goes into the core, I'd expect someone will have to do ultra-heavy testing first ;)

-m
[tw] Re: Slices with names in non-latin letters?
On Feb 5, 9:56 am, Yakov yakov.litvin.publi...@gmail.com wrote:
> Some time ago I found that non-latin letters stop two-column table rows from being treated as slices. For instance, if I write |Имя|Иван| (which is |Name|Ivan|) I cannot use it as a slice (for instance, in transclusion macros).
> ... is it possible that the slices part of the core, or GridPlugin, will be refactored so that non-latin names also become available?

The TWCore uses a regexp text pattern to parse slice definitions embedded in tiddler content. Here's the pattern used by the core:

(?:^([\'\/]{0,2})~?([\.\w]+)\:\1[\t\x20]*([^\n]+)[\t\x20]*$)|(?:^\|([\'\/]{0,2})~?([\.\w]+)\:?\4\|[\t\x20]*([^\n]+)[\t\x20]*\|$)

This pattern actually matches *two* alternatives for the slice-definition syntax, using either name:value or |name|value|. Of course, regexp can be incredibly painful to read and understand... so, here's a breakout of the parts of this pattern:

(?:              start name:value syntax
^                start of line
([\'\/]{0,2})    optional start bold or italic formatting
~?               optional non-wikiword prefix
([\.\w]+)        slice name
\:               :
\1               optional end bold or italic (matching above)
[\t\x20]*        optional leading whitespace
([^\n]+)         slice value
[\t\x20]*        optional trailing whitespace
$                end of line
)                end name:value syntax
|
(?:              start |name|value| syntax
^                start of line
\|               table cell boundary
([\'\/]{0,2})    optional start bold or italic formatting
~?               optional non-wikiword prefix
([\.\w]+)        slice name
\:?              optional :
\4               optional end bold or italic formatting (matching above)
\|               table cell boundary
[\t\x20]*        optional leading whitespace
([^\n]+)         slice value
[\t\x20]*        optional trailing whitespace
\|               table cell boundary
$                end of line
)                end |name|value| syntax

As you can see above, the slice *name* pattern is: [\.\w]+ which matches one or more occurrences of . (a literal period, since the dot is escaped and sits inside a character class) or \w (any 'word' character = upper/lower letters, numbers, or underline).

This slice name pattern works successfully for standard latin character sets. However, as you noted, it doesn't work when applied to non-latin character sets. The problem is most likely that \w in JavaScript regexps only covers the ASCII word characters (A-Z, a-z, 0-9 and _), so non-latin letters never match it, however well the browser handles the page encoding otherwise. Still, you might be able to play around with the regexp pattern and add the non-latin letters themselves, or Unicode escapes (\uNNNN; \xNN only reaches code points up to FF), to match the symbols of the non-latin character set... but I suspect it will be VERY ugly :(

Sorry I can't offer a more encouraging response at this time.

-e
Eric Shulman
TiddlyTools / ELS Design Studios
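To see the two alternatives of the quoted pattern in action, the sketch below runs it over a small sample text and collects the slices it finds. The pattern is copied from the message above; the collectSlices helper and the sample text are made up for this illustration, and the loop is only a guess at how such a pattern would typically be applied, not the core's actual code.

```javascript
// Pattern as quoted in the message above (name:value form | table-row form).
var slicesRE = /(?:^([\'\/]{0,2})~?([\.\w]+)\:\1[\t\x20]*([^\n]+)[\t\x20]*$)|(?:^\|([\'\/]{0,2})~?([\.\w]+)\:?\4\|[\t\x20]*([^\n]+)[\t\x20]*\|$)/gm;

// Hypothetical helper: collect every slice found in a piece of text.
function collectSlices(text) {
  var found = {};
  slicesRE.lastIndex = 0;     // a /g regexp keeps its position between calls
  var m;
  while ((m = slicesRE.exec(text)) !== null) {
    if (m[2]) {
      found[m[2]] = m[3];     // name:value form (groups 1-3)
    } else {
      found[m[5]] = m[6];     // |name|value| form (groups 4-6)
    }
  }
  return found;
}

var sample = [
  "Author: Eric",
  "|Name|Ivan|",
  "|''Version:''|1.0|",
  "|Имя|Иван|"
].join("\n");

var result = collectSlices(sample);
console.log(result);
```

With this sample, the Author, Name and Version slices are picked up, while the |Имя|Иван| row is skipped because its name contains no character that [\.\w] accepts.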