[tw] Re: Slices with names in non-latin letters?

2012-02-24 Thread Yakov
Ok, I've done simple tests. Adding

абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ

(with a space at the end) to each of the ([\.\w]+) parts got this
working:

 Tiddler: [[Сегменты с русскими именами: тесты]] 
|Slicename|slice content|
|Slice name|slice content 2|
|Имясегмента|содержимое сегмента 3|
|Имя сегмента|содержимое сегмента 4|
{{{tiddler [[Сегменты с русскими именами: тесты::Slicename]]}}}
tiddler [[Сегменты с русскими именами: тесты::Slicename]]
{{{tiddler [[Сегменты с русскими именами: тесты::Slice name]]}}}
tiddler [[Сегменты с русскими именами: тесты::Slice name]]
{{{tiddler [[Сегменты с русскими именами: тесты::Имясегмента]]}}}
tiddler [[Сегменты с русскими именами: тесты::Имясегмента]]
{{{tiddler [[Сегменты с русскими именами: тесты::Имя сегмента]]}}}
tiddler [[Сегменты с русскими именами: тесты::Имя сегмента]]

(each of the four tiddler macros shows the content).
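
For readers unfamiliar with the Title::Slice name notation used in the tests, here is a minimal sketch of how such a reference can be split. splitSliceRef is a hypothetical helper written for illustration, not the core's own code; it assumes the separator is the first "::" in the reference, which is why the single colon inside the tiddler title does no harm.

```javascript
// Hypothetical helper (not part of TiddlyWiki): split "Title::Slice name"
// into its tiddler title and slice name at the first "::".
function splitSliceRef(ref) {
  var pos = ref.indexOf("::");
  if (pos === -1) {
    return { title: ref, slice: null }; // plain tiddler reference
  }
  return { title: ref.slice(0, pos), slice: ref.slice(pos + 2) };
}

var ref = splitSliceRef("Сегменты с русскими именами: тесты::Имя сегмента");
console.log(ref.title); // the tiddler title, single colon intact
console.log(ref.slice); // the slice name, space included
```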

But the thing is, I got this working only by changing the core itself.
First, I wrote a plugin:

TiddlyWiki.prototype.slicesRE = /(?:^([\'\/]{0,2})~?([\.\wабвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ ]+)\:\1[\t\x20]*([^\n]*)[\t\x20]*$)|(?:^\|([\'\/]{0,2})~?([\.\wабвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ ]+)\:?\4\|[\t\x20]*([^\|\n]*)[\t\x20]*\|$)/gm;

which didn't work. I guess that's because slicesRE is redefined after
the slices hashmap has already been built. Is anybody aware of a fast
way to rebuild the slices? Of course, I could copy the store, then
purge the main one, then copy the tiddlers back into the main store,
but that is too heavy to run on every startup.
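
One way to picture the problem, and a cheaper alternative to copying the whole store: if slices are cached per tiddler after the first parse, a new pattern only takes effect once that cache is emptied. The sketch below is a self-contained toy model, not TiddlyWiki code. SliceCache, getSlice and setPattern are made-up names, the patterns are simplified to the |name|value| form, and whether the real store exposes its cache in a comparable way is an assumption that would need checking against the core source.

```javascript
// Toy model of a slice cache (hypothetical names, not the TiddlyWiki API).
function SliceCache(texts, pattern) {
  this.texts = texts;     // title -> tiddler text
  this.pattern = pattern; // global, multiline RegExp; group 1 = name, 2 = value
  this.cache = {};        // title -> { sliceName: value }
}

// Parse every |name|value| row of one tiddler with the current pattern.
SliceCache.prototype.parse = function (title) {
  var slices = {};
  var text = this.texts[title] || "";
  var m;
  this.pattern.lastIndex = 0;
  while ((m = this.pattern.exec(text)) !== null) {
    slices[m[1]] = m[2];
  }
  return slices;
};

// Look up a slice, parsing the tiddler only on first access.
SliceCache.prototype.getSlice = function (title, name) {
  if (!this.cache[title]) {
    this.cache[title] = this.parse(title);
  }
  return this.cache[title][name];
};

// Swap the pattern and drop everything parsed with the old one.
SliceCache.prototype.setPattern = function (pattern) {
  this.pattern = pattern;
  this.cache = {};
};

var store = new SliceCache(
  { Test: "|Name|Ivan|\n|Имя сегмента|Иван|" },
  /^\|([.\w]+)\|([^|\n]*)\|$/gm
);
console.log(store.getSlice("Test", "Имя сегмента")); // undefined with the old pattern
store.setPattern(/^\|([.\wа-яёА-ЯЁ ]+)\|([^|\n]*)\|$/gm);
console.log(store.getSlice("Test", "Имя сегмента")); // "Иван" after the reset
```

The point of the design is that only the cache is thrown away; the tiddler texts stay where they are and are re-parsed lazily on the next lookup.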

On the other hand, I'm going to analyse the syntax, run some tests,
and then bring this up for a core update, so perhaps the first
question is not that important.

On 12 Feb, 16:31, PMario pmari...@gmail.com wrote:
 On Feb 12, 1:20 pm, Yakov yakov.litvin.publi...@gmail.com wrote:
  On 11 Feb, 17:26, PMario pmari...@gmail.com wrote:

   On Feb 10, 10:31 pm, Yakov yakov.litvin.publi...@gmail.com wrote:
    ... (not sure what the term hashmap means here) So at first
    glance it seems that it's possible to have slices with any
    symbols.. Let me know if I miss something.

   In this case it's just a lookup table [2] to have fast access to a
   tiddler, based on its title.
   see:https://github.com/TiddlyWiki/tiddlywiki/blob/master/js/TiddlyWiki.js...

  So, what's different with slices? I looked up TiddlyWiki.js and it
  seems that slices are used the same way.. Although, they can be used
  differently in other .js parts.

 ahhh, :)
 IMO nothing. But there may be some characters that are not allowed in
 an object property name. To handle this, an escape mechanism would have
 to be found. I think you should extend your formatter with a plugin and
 run several tests to find out if it works ;)

 -m

-- 
You received this message because you are subscribed to the Google Groups 
TiddlyWiki group.
To post to this group, send email to tiddlywiki@googlegroups.com.
To unsubscribe from this group, send email to 
tiddlywiki+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/tiddlywiki?hl=en.



[tw] Re: Slices with names in non-latin letters?

2012-02-12 Thread Yakov
On 11 Feb, 17:26, PMario pmari...@gmail.com wrote:
 On Feb 10, 10:31 pm, Yakov yakov.litvin.publi...@gmail.com wrote:
  ... (not sure what the term hashmap means here) So at first glance
  it seems that it's possible to have slices with any symbols.. Let me
  know if I miss something.

 In this case it's just a lookup table [2] to have fast access to a
 tiddler, based on its title.
 see:https://github.com/TiddlyWiki/tiddlywiki/blob/master/js/TiddlyWiki.js...

So, what's different with slices? I looked up TiddlyWiki.js and it
seems that slices are used the same way.. Although, they can be used
differently in other .js parts.




[tw] Re: Slices with names in non-latin letters?

2012-02-12 Thread PMario
On Feb 12, 1:20 pm, Yakov yakov.litvin.publi...@gmail.com wrote:
 On 11 Feb, 17:26, PMario pmari...@gmail.com wrote:

  On Feb 10, 10:31 pm, Yakov yakov.litvin.publi...@gmail.com wrote:
   ... (not sure what the term hashmap means here) So at first glance
   it seems that it's possible to have slices with any symbols.. Let
   me know if I miss something.

  In this case it's just a lookup table [2] to have fast access to a
  tiddler, based on its title.
  see:https://github.com/TiddlyWiki/tiddlywiki/blob/master/js/TiddlyWiki.js...

 So, what's different with slices? I looked up TiddlyWiki.js and it
 seems that slices are used the same way.. Although, they can be used
 differently in other .js parts.
ahhh, :)
IMO nothing. But there may be some characters that are not allowed in
an object property name. To handle this, an escape mechanism would have
to be found. I think you should extend your formatter with a plugin and
run several tests to find out if it works ;)

-m




[tw] Re: Slices with names in non-latin letters?

2012-02-11 Thread PMario
On Feb 10, 10:31 pm, Yakov yakov.litvin.publi...@gmail.com wrote:
 ... (not sure what the term hashmap means here) So at first glance
 it seems that it's possible to have slices with any symbols.. Let me
 know if I miss something.
In this case it's just a lookup table [2] that gives fast access to a
tiddler based on its title.
see: https://github.com/TiddlyWiki/tiddlywiki/blob/master/js/TiddlyWiki.js#L34

[2] http://en.wikipedia.org/wiki/Hashmap

-m




[tw] Re: Slices with names in non-latin letters?

2012-02-10 Thread Yakov
On 7 Feb, 01:23, PMario pmari...@gmail.com wrote:
 On Feb 6, 5:27 pm, Yakov yakov.litvin.publi...@gmail.com wrote:

  [\.\wа-яё]

 If you have a look at the slices handling, you'll see that they end up
 as something similar to the following.

 var x = {};
 x['Имя'] = 'Иван';
 console.log(x);

 So as long as the browser's JavaScript can access Russian object
 property names, it should be possible.

 var x = {};
 x['абвгдеёжзийклмнопрстуфхцчшщъыьэюяабвгдеёжзийклмнопрстуфхцчшщъыьэюя'] =
   'абвгдеёжзийклмнопрстуфхцчшщъыьэюяабвгдеёжзийклмнопрстуфхцчшщъыьэюя';
 console.log(x);

 Both tests seem to work. So the browser may digest it. I'm not so sure
 about TW :)

  (the а-яё part is russian alphabet). The other question is -- is it
  safe to add \- and \s to this part (as they can be parts of data
  names, like in long-term plans)?

 If you add spaces and minus signs to object property names, I bet you'll
 run into big trouble.

  ... In fact, it seems that if someone
  remembers what are the limitations here, [\.\w] can be substituted
  with [^something] where something is those symbols that shouldn't be
  in the slicename. In this case, this can go to the core.. (if the
  RegExp is the *only* problem here).

 If this is going into the core, I'd expect someone to do ultra-heavy
 testing first ;)

 -m

I'll definitely do some tests when I get enough time. The main thought
in my head about this is: tiddlers have titles with almost any symbols
(like in [1], title is just a line!), and they are stored as properties
of the tiddlers object [2]! (I'm not sure what the term hashmap means
here.) So at first glance it seems that it's possible to have slices
with any symbols.. Let me know if I miss something.

[1] https://github.com/TiddlyWiki/tiddlywiki/blob/master/js/TiddlyWiki.js#L19
[2] https://github.com/TiddlyWiki/tiddlywiki/blob/master/js/TiddlyWiki.js#L7




[tw] Re: Slices with names in non-latin letters?

2012-02-06 Thread Yakov
Is it only the RegExp that causes the problem? If so, I think I can make
a small patch that replaces the current RegExp with one
containing

[\.\wа-яё]

instead of

[\.\w]

(the а-яё part is the lowercase Russian alphabet). The other question
is: is it safe to add \- and \s to this class (as they can be parts of
data names, as in "long-term plans")? In fact, it seems that if someone
remembers what the limitations are here, [\.\w] can be replaced with
[^something], where something is the set of symbols that shouldn't
appear in a slice name. In that case, this could go into the core (if
the RegExp is the *only* problem here).
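
To make the options concrete, here is a small sketch comparing the current name class, an extended one, and a negated class. The exclusion list in `negated` (pipe, colon, newline) is only a guess at what "something" would need to contain, and the extended class adds the upper-case range А-ЯЁ because а-яё alone does not cover capital letters.

```javascript
// Candidate patterns for a whole slice name (anchored for testing).
var current  = /^[\.\w]+$/;          // the class used by the core pattern
var extended = /^[\.\wа-яёА-ЯЁ]+$/;  // plus Russian letters, both cases
var negated  = /^[^|:\n]+$/;         // assumption: exclude |, : and newline

var names = ["Slicename", "Имясегмента", "Имя сегмента", "long-term"];
names.forEach(function (name) {
  console.log(name, current.test(name), extended.test(name), negated.test(name));
});

// The lowercase-only class rejects a capitalised name such as "Имя".
console.log(/^[\.\wа-яё]+$/.test("Имя")); // false
```

In this sketch only the negated class accepts names containing a space or a hyphen; the extended class still needs those characters added explicitly.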

On 5 Feb, 21:56, Eric Shulman elsdes...@gmail.com wrote:
 On Feb 5, 9:56 am, Yakov yakov.litvin.publi...@gmail.com wrote:

  Some time ago I found that non-Latin letters stop two-column table
  rows from being treated as slices. For instance, if I write
  |Имя|Иван|
  (which is |Name|Ivan|) I cannot use it as a slice (for instance, in
  transclusion macros).
 ...
  Is it possible that the slices part of the core, or GridPlugin, will be
  refactored so that non-Latin names also work?

 The TWCore uses a regexp text pattern to parse slice definitions
 embedded in tiddler content.  Here's the pattern used by the core:
 
 (?:^([\'\/]{0,2})~?([\.\w]+)\:\1[\t\x20]*([^\n]+)[\t\x20]*$)|(?:^\|([\'\/]{0,2})~?([\.\w]+)\:?\4\|[\t\x20]*([^\n]+)[\t\x20]*\|$)
 

 This pattern actually matches *two* alternatives for the
 slice-definition syntax, using either name:value or |name|value|.  Of
 course, regexp can be incredibly painful to read and understand... so,
 here's a breakdown of the parts of this pattern:
 
 (?:             start name:value syntax
   ^             start of line
   ([\'\/]{0,2}) optional start bold or italic formatting
   ~?            optional non-wikiword prefix
   ([\.\w]+)     slice name
   \:            :
   \1            optional end bold or italic (matching above)
   [\t\x20]*     optional leading whitespace
   ([^\n]+)      slice value
   [\t\x20]*     optional trailing whitespace
   $             end of line
 )               end name:value syntax
 |
 (?:             start |name|value| syntax
   ^             start of line
   \|            table cell boundary
   ([\'\/]{0,2}) optional start bold or italic formatting
   ~?            optional non-wikiword prefix
   ([\.\w]+)     slice name
   \:?           optional :
   \4            optional end bold or italic formatting (matching above)
   \|            table cell boundary
   [\t\x20]*     optional leading whitespace
   ([^\n]+)      slice value
   [\t\x20]*     optional trailing whitespace
   \|            table cell boundary
   $             end of line
 )               end |name|value| syntax
 

 As you can see above, the slice *name* pattern is:
    [\.\w]+
 which matches one or more occurrences of . (a literal period; inside
 a character class the dot is not a wildcard) or \w (any 'word'
 character = upper/lower ASCII letters, digits, or underscore).  This
 slice name pattern works successfully for standard Latin character
 sets.  However, as you noted, it doesn't seem to work when applied to
 non-Latin character sets.

 I'm guessing that the problem arises because . and \w only match
 single-byte characters, but the non-latin characters are using multi-
 byte encoding.  Unfortunately, although I'd hope that any decent I18N-
 ready browser should handle multi-byte encodings properly, this might
 be a limitation of the browser's internal regexp processing.

 Still, you might be able to play around with the regexp pattern to use
 hex codes (\xNN) to match the symbols of the non-latin character
 set... but I suspect it will be VERY ugly :(

 Sorry I can't offer a more encouraging response at this time.

 -e
 Eric Shulman
 TiddlyTools / ELS Design Studios

 
 WAS THIS ANSWER HELPFUL?  IF SO, PLEASE MAKE A DONATION
    http://www.TiddlyTools.com/#Donations
 note: donations are directly used to pay for food, rent,
 gas, net connection, etc., so please give generously and often!

 Professional TiddlyWiki Consulting Services...
 Analysis, Design, and Custom Solutions:
    http://www.TiddlyTools.com/#Contact




[tw] Re: Slices with names in non-latin letters?

2012-02-06 Thread PMario
On Feb 6, 5:27 pm, Yakov yakov.litvin.publi...@gmail.com wrote:
 [\.\wа-яё]

If you have a look at the slices handling, you'll see that they end up
as something similar to the following.

var x = {};
x['Имя'] = 'Иван';
console.log(x);

So as long as the browser's JavaScript can access Russian object
property names, it should be possible.

var x = {};
x['абвгдеёжзийклмнопрстуфхцчшщъыьэюяабвгдеёжзийклмнопрстуфхцчшщъыьэюя'] =
  'абвгдеёжзийклмнопрстуфхцчшщъыьэюяабвгдеёжзийклмнопрстуфхцчшщъыьэюя';
console.log(x);

Both tests seem to work. So the browser may digest it. I'm not so sure
about TW :)

 (the а-яё part is russian alphabet). The other question is -- is it
 safe to add \- and \s to this part (as they can be parts of data
 names, like in long-term plans)?
If you add spaces and minus signs to object property names, I bet you'll
run into big trouble.
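
Whether spaces and hyphens actually cause trouble depends on how the property is accessed. A minimal sketch, assuming slices are always read with bracket notation (that is an assumption about the calling code, not something checked against the core here):

```javascript
var slices = {};

// Bracket notation accepts any string as a property name.
slices["Имя сегмента"] = "содержимое сегмента 4";
slices["long-term"] = "plans";

console.log(slices["Имя сегмента"]); // "содержимое сегмента 4"
console.log(slices["long-term"]);    // "plans"

// Dot notation is where it breaks: slices.long-term parses as
// (slices.long) - term, and a name with a space is a syntax error.

// Names that collide with built-in properties are a separate risk;
// an object without a prototype avoids inherited hits.
var bare = Object.create(null);
console.log("toString" in slices); // true (inherited from Object.prototype)
console.log("toString" in bare);   // false
```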

 ... In fact, it seems that if someone
 remembers what are the limitations here, [\.\w] can be substituted
 with [^something] where something is those symbols that shouldn't be
 in the slicename. In this case, this can go to the core.. (if the
 RegExp is the *only* problem here).
If this is going into the core, I'd expect someone to do ultra-heavy
testing first ;)

-m




[tw] Re: Slices with names in non-latin letters?

2012-02-05 Thread Eric Shulman
On Feb 5, 9:56 am, Yakov yakov.litvin.publi...@gmail.com wrote:
 Some time ago I found that non-Latin letters stop two-column table
 rows from being treated as slices. For instance, if I write
 |Имя|Иван|
 (which is |Name|Ivan|) I cannot use it as a slice (for instance, in
 transclusion macros).
...
 Is it possible that the slices part of the core, or GridPlugin, will be
 refactored so that non-Latin names also work?

The TWCore uses a regexp text pattern to parse slice definitions
embedded in tiddler content.  Here's the pattern used by the core:

(?:^([\'\/]{0,2})~?([\.\w]+)\:\1[\t\x20]*([^\n]+)[\t\x20]*$)|(?:^\|([\'\/]{0,2})~?([\.\w]+)\:?\4\|[\t\x20]*([^\n]+)[\t\x20]*\|$)


This pattern actually matches *two* alternatives for the
slice-definition syntax, using either name:value or |name|value|.  Of
course, regexp can be incredibly painful to read and understand... so,
here's a breakdown of the parts of this pattern:

(?:             start name:value syntax
  ^             start of line
  ([\'\/]{0,2}) optional start bold or italic formatting
  ~?            optional non-wikiword prefix
  ([\.\w]+)     slice name
  \:            :
  \1            optional end bold or italic (matching above)
  [\t\x20]*     optional leading whitespace
  ([^\n]+)      slice value
  [\t\x20]*     optional trailing whitespace
  $             end of line
)               end name:value syntax
|
(?:             start |name|value| syntax
  ^             start of line
  \|            table cell boundary
  ([\'\/]{0,2}) optional start bold or italic formatting
  ~?            optional non-wikiword prefix
  ([\.\w]+)     slice name
  \:?           optional :
  \4            optional end bold or italic formatting (matching above)
  \|            table cell boundary
  [\t\x20]*     optional leading whitespace
  ([^\n]+)      slice value
  [\t\x20]*     optional trailing whitespace
  \|            table cell boundary
  $             end of line
)               end |name|value| syntax
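
As a rough illustration of how the groups in this breakdown are used, the sketch below runs the pattern as quoted above over a few lines and collects name/value pairs. buildSlicesRE and parseSlices are illustrative names, not core functions; the gm flags follow the plugin code posted elsewhere in this thread; and the extended character class in the second call is just one possible variation.

```javascript
// Build the slice pattern quoted above around a given name character class.
// Illustrative helper, not part of the TiddlyWiki core.
function buildSlicesRE(nameClass) {
  var source =
    String.raw`(?:^([\'\/]{0,2})~?(${nameClass}+)\:\1[\t\x20]*([^\n]+)[\t\x20]*$)` +
    "|" +
    String.raw`(?:^\|([\'\/]{0,2})~?(${nameClass}+)\:?\4\|[\t\x20]*([^\n]+)[\t\x20]*\|$)`;
  return new RegExp(source, "gm");
}

// Collect { name: value } from text. Groups 2/3 hold a name:value match,
// groups 5/6 a |name|value| match.
function parseSlices(text, re) {
  var slices = {};
  var m;
  re.lastIndex = 0;
  while ((m = re.exec(text)) !== null) {
    if (m[2] !== undefined) {
      slices[m[2]] = m[3];
    } else {
      slices[m[5]] = m[6];
    }
  }
  return slices;
}

var text = "|Name|Ivan|\n|Имя|Иван|\nColor: red";
// With the core name class only the Latin names are picked up.
console.log(parseSlices(text, buildSlicesRE(String.raw`[\.\w]`)));
// With Russian letters added to the class, the Имя row is found as well.
console.log(parseSlices(text, buildSlicesRE(String.raw`[\.\wа-яёА-ЯЁ]`)));
```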


As you can see above, the slice *name* pattern is:
   [\.\w]+
which matches one or more occurrences of . (a literal period; inside
a character class the dot is not a wildcard) or \w (any 'word'
character = upper/lower ASCII letters, digits, or underscore).  This
slice name pattern works successfully for standard Latin character
sets.  However, as you noted, it doesn't seem to work when applied to
non-Latin character sets.
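
A quick check of what [\.\w] accepts, as a minimal sketch assuming standard ECMAScript regexp semantics: \w is defined as exactly the ASCII set [A-Za-z0-9_] regardless of how the page is encoded, and a dot inside a character class is a literal period. The sample strings are made up for the test.

```javascript
var nameClass = /^[\.\w]+$/;

// The dot in a character class is a literal period, not "any character".
console.log(nameClass.test("config.options")); // true
console.log(nameClass.test("long-term"));      // false: "-" is not in the class

// \w covers only A-Z, a-z, 0-9 and _, so non-Latin letters are rejected.
console.log(/^\w+$/.test("Name")); // true
console.log(/^\w+$/.test("Имя"));  // false
console.log(/^\w+$/.test("café")); // false: é is outside ASCII as well
```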

I'm guessing that the problem arises because . and \w only match
single-byte characters, but the non-latin characters are using multi-
byte encoding.  Unfortunately, although I'd hope that any decent I18N-
ready browser should handle multi-byte encodings properly, this might
be a limitation of the browser's internal regexp processing.

Still, you might be able to play around with the regexp pattern to use
hex codes (\xNN) to match the symbols of the non-latin character
set... but I suspect it will be VERY ugly :(
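
For what it is worth, \xNN only reaches code points up to U+00FF, so Cyrillic needs the four-digit \uNNNN form instead. A minimal sketch, assuming the basic Cyrillic block U+0400 to U+04FF is wide enough for the names in question:

```javascript
// Name class extended with the Cyrillic block via \uNNNN escapes,
// so the pattern source itself stays pure ASCII.
var cyrillicName = /^[\.\w\u0400-\u04FF]+$/;

console.log(cyrillicName.test("Имя"));          // true
console.log(cyrillicName.test("Slicename"));    // true
console.log(cyrillicName.test("Имя сегмента")); // false: space is still excluded

// The escape and the literal character are the same code point.
console.log("\u0418" === "И"); // true
```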

Sorry I can't offer a more encouraging response at this time.

-e
Eric Shulman
TiddlyTools / ELS Design Studios

