Re: Mapping Field Text Ranges (was Re: Interprocess Communication (IPC) under OSX)

2017-12-28 Thread Paul Dupuis via use-livecode
Mark,

Thank you so much



Re: Mapping Field Text Ranges (was Re: Interprocess Communication (IPC) under OSX)

2017-12-28 Thread J. Landman Gay via use-livecode

On 12/28/17 2:42 PM, Mark Wieder via use-livecode wrote:

On 12/28/2017 09:45 AM, Mark Waddingham via use-livecode wrote:

3) The char chunks are old-style (pre-5.5) byte indices not codeunit
indices


The crux of the problem Paul is having comes down to (3) which has 
some background to explain.


OMG! This is what Mr. Waddingham comes up with while on break!?



He can't help it, it's the only way his brain works... For which we are 
all grateful.


--
Jacqueline Landman Gay | jac...@hyperactivesw.com
HyperActive Software   | http://www.hyperactivesw.com

___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode


Re: Mapping Field Text Ranges (was Re: Interprocess Communication (IPC) under OSX)

2017-12-28 Thread Mark Wieder via use-livecode

On 12/28/2017 09:45 AM, Mark Waddingham via use-livecode wrote:

3) The char chunks are old-style (pre-5.5) byte indices not codeunit
indices


The crux of the problem Paul is having comes down to (3) which has some 
background to explain.


OMG! This is what Mr. Waddingham comes up with while on break!?

--
 Mark Wieder
 ahsoftw...@gmail.com



Mapping Field Text Ranges (was Re: Interprocess Communication (IPC) under OSX)

2017-12-28 Thread Mark Waddingham via use-livecode

On 2017-12-19 19:43, Mark Waddingham via use-livecode wrote:

I'm pretty sure it would be possible to write a handler which takes
the styledText array of a field in 6.7.11 and a list of old indices,
returning a list of new char indices... Would that help?


Paul expressed an interest in how this might work - and he provided some 
more background:


-*-

Our main application, HyperRESEARCH, a tool for academics and others
doing qualitative research, relies completely on chunk ranges. It is
essentially a bookmarking tool where users can select some content from
a document, the character position (chunk) is grabbed and the user gives
it a text label and HyperRESEARCH remembers that label "Early Childhood
Behavior X" points to char S to T of document "ABC". All documents,
native text, unicode (utf8 or utf16), rtf, docx, odt, etc. are read into
a LiveCode field, from which the selection is made and the chunk
obtained. HyperRESEARCH saves a "Study" file that contains a LOT of
these labels and chunks and document names.

As part of our migration from LC464, which is what the current release
of HyperRESEARCH is based on, we needed a way to convert a character
range created under LC4.6.4 to a range under LC6.7.11 that points to the
exact same string for the same file. Curry Kenworthy, whose libraries we
license for reading MS-Word and Open Office documents, built a library
based on an algorithm I came up with to send the original LC464 ranges
to a helper application using sockets or IPC. The helper application
retrieves the strings associated with each chunk, strips white space and
passes the strings back to the LC6.7.11 version of the main app, which
then finds the whitespace-stripped strings in the same file loaded under
LC6.7.11 with an indexing mechanism to adjust the positions for the
stripped whitespace. It is a bit complicated, but it works reliably.

-*-

From this I infer the following:

1) The study file is a list of triples - label, char chunk, document 
filename


2) When using the study file, the original document is loaded into a 
field, and the char chunks are used to display labels which the user can 
jump to.


3) The char chunks are old-style (pre-5.5) byte indices not codeunit
indices


The crux of the problem Paul is having comes down to (3), which needs
some background to explain.


Before 7.0, the field was the only part of the engine which naturally
handled Unicode. In these older versions the field stored text as a
mixed sequence of style runs of either single bytes (native text) or
double bytes (unicode text).


Between 5.5 and 7.0, indices used when referencing chars in fields
corresponded to codeunits - this meant that the indices were
independent of the encoding of the runs in the field. In this case char
N referred to the Nth codeunit in the field, whether the text up until
that point was all unicode, all native or a mixture of both.


Before 5.5, indices used when referencing chars in fields corresponded
to bytes - this meant that you had to take into account the encoding of
the runs in the field. In this case, char N referred to the Nth byte in
the field. So if your field had:


 abcXYZabc (where XYZ are two byte unicode chars)

Then char 4 would refer to the first byte of the X unicode char and 
*not* the two bytes it would have actually taken up.
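
As a rough illustration of the byte numbering described above, here is a
minimal Python sketch (not LiveCode, and not the engine's actual code). It
assumes a field can be modelled as a list of (text, is_unicode) runs, that
native runs take one byte per char and unicode runs take two, and that no
char lies outside the Basic Multilingual Plane. The name byte_spans is made
up for this example.

```python
def byte_spans(runs):
    """Return (char, first_byte, last_byte) for every char, 1-based.

    runs is a list of (text, is_unicode) pairs: a hypothetical stand-in
    for the field's internal style runs.
    """
    spans = []
    next_byte = 1
    for text, is_unicode in runs:
        width = 2 if is_unicode else 1  # assumed bytes per char in this run
        for ch in text:
            spans.append((ch, next_byte, next_byte + width - 1))
            next_byte += width
    return spans


# The abcXYZabc field from the example: native, unicode, native.
runs = [("abc", False), ("XYZ", True), ("abc", False)]
for ch, first, last in byte_spans(runs):
    print(ch, first, last)
```

Under those assumptions X occupies bytes 4 to 5, so a pre-5.5 "char 4"
lands on its first byte, and the final c is byte 12 even though it is
only the 9th codeunit.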


Now, importantly, the internal structure of the field did not change 
between 4.6.4 and 5.5, just how the 'char' chunk worked - in 6.7.11, the 
internal structure of the field is still the mixed runs of 
unicode/native bytes just as it was in 4.6.4 - the only difference is 
what happens if you reference char X to Y of the field.


So solving this problem comes down to finding a means to 'get at' the 
internal encoding style runs of a field in 6.7.11. We want a handler:


  mapByteRangeToCharRange(pFieldId, pByteFrom, pByteTo)

Returning a pair pCharFrom, pCharTo - where pByteFrom, pByteTo are a 
char X to Y range from 4.6.4 and pCharFrom, pCharTo are a char X to Y 
range *for the same range* in 6.7.11.
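
To make the shape of that mapping concrete, here is a hedged Python sketch
of one way it could work. It is not the LiveCode handler itself: it takes a
list of (length, is_unicode) runs instead of a field id, and that run list
is an assumed stand-in for whatever can be derived from the field's
styledText. The function names are illustrative only.

```python
def byte_to_char(runs, byte_index):
    """Map a 1-based pre-5.5 byte index to a 1-based codeunit index."""
    if byte_index < 1:
        raise ValueError("byte index must be at least 1")
    bytes_before = 0
    chars_before = 0
    for length, is_unicode in runs:
        width = 2 if is_unicode else 1  # assumed bytes per char in this run
        run_bytes = length * width
        if byte_index <= bytes_before + run_bytes:
            # The byte falls inside this run: find which char contains it.
            offset = byte_index - bytes_before - 1
            return chars_before + offset // width + 1
        bytes_before += run_bytes
        chars_before += length
    raise ValueError("byte index is past the end of the field")


def map_byte_range_to_char_range(runs, byte_from, byte_to):
    """Return (char_from, char_to) covering the same text as the byte range."""
    return byte_to_char(runs, byte_from), byte_to_char(runs, byte_to)


# abcXYZabc: 3 native chars, 3 unicode chars, 3 native chars.
runs = [(3, False), (3, True), (3, False)]
print(map_byte_range_to_char_range(runs, 4, 9))
```

With the abcXYZabc runs, the old byte range 4 to 9 (the six bytes of XYZ)
maps to chars 4 to 6. A byte index that lands on the second byte of a
unicode char is mapped to the char containing it, which is a design choice
of this sketch rather than something stated in the thread.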


-*-

Before going into the details, an easy way to see the internal mixed 
encoding of a field containing unicode in 6.7.11, is to put some text 
which is a mixture of native text and unicode text in a field and then 
look at its 'text' property. Putting:


Лорем ипсум Lorem ipsum dolor sit amet, pr долор сит амет, вел татион 
игнота сцрибентур еи. Вих еа феугиат doctus necessitatibus ассентиор 
пхилосопхиа. Феугаитconsulatu disputando comprehensam  вивендум вис ет, 
мел еррем малорум ат. Хас но видерер лобортис, suscipit detraxit 
interesset eum аппетере инсоленс салутатус усу не. Еи дуо лудус яуаеяуе, 
ет елитр цорпора пер.


Into a 6.7.11 field and then doing 'put the text of field 1' gives:

? ? Lorem ipsum dolor sit amet, pr ? ??? , ??? ?? 
?? ?? ??. ??? ?? ??? doctus necessitatibus ? 
???. ???consulatu disputando comprehensam   ??? ??, 
??? ? ??? ??. ??? 

Re: Interprocess Communication (IPC) under OSX

2017-12-19 Thread Mark Waddingham via use-livecode

On 2017-12-19 19:43, Mark Waddingham via use-livecode wrote:

On 2017-12-18 20:01, Paul Dupuis via use-livecode wrote:
In principle, the same code should work equally great under OSX, but it
does not. And yes, if (and when) I have time, I will track down the bugs
and report them, but at the moment I was hoping for a quick fix where
someone else already on the list may have identified the OSX
idiosyncrasies and had sample code that worked around them :-)


Hmmm - I have quite a long memory but it doesn't quite stretch as far
back as 4.6.4 these days so I can't say what might have changed on the
process related commands in versions since then.

I do know that there *were* significant bugs in process communication
on OS X for quite some time after I started working at LiveCode (well,
RunRev back then) - which gradually got fixed. I'm pretty sure that
the most recent versions (certainly since 6.x) should be working
correctly - but I can't really say much about versions before then.

So I suspect the issues might be to do with the 4.6.4 side - rather
than the version on the other side (presumably 6.7.11?).


Of course, looking through the release notes for 4.6.4 I found this:



Slave process improvements (4.5)

A number of issues with the open process command and the engine itself
have, up until now, conspired to make it difficult (if not impossible!)
to either run a slave process, or use the engine as slave on all
platforms.

These issues have been resolved in this version, thus making it
straightforward to run another process and poll for input and output
over stdin/stdout.

The typical form for this is along the following lines (this example
assumes the process being executed outputs whole lines):

command startSlave pProcess
  open process pProcess for text update
  send "monitorSlave pProcess" to me in 50 millisecs
end startSlave

command monitorSlave pProcess
  repeat forever
    # Loop until there are no more lines to read.
    read from process pProcess for 1 line in 0 millisecs
    if the result is empty then
      # The slave has sent us something, so process it and loop for
      # (potentially) more data.
    else if the result is "timeout" then
      # There is nothing waiting for us, so exit repeat
      exit repeat
    else if the result is "eof" then
      # The slave has terminated, so do any final processing and finish
      # monitoring.
      close process pProcess
      exit monitorSlave
    else
      # Some error has occurred!
      exit monitorSlave
    end if
  end repeat
  send "monitorSlave pProcess" to me in 50 millisecs
end monitorSlave



So, it would seem that the issues I refer to above were fixed in 4.5... 
So either other bugs have crept in, or there is a cross-platform 
difference lurking somewhere between mac/windows...


Warmest Regards,

Mark.

--
Mark Waddingham ~ m...@livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps


Re: Interprocess Communication (IPC) under OSX

2017-12-19 Thread Mark Waddingham via use-livecode

On 2017-12-18 20:01, Paul Dupuis via use-livecode wrote:

In principle, the same code should work equally great under OSX, but it
does not. And yes, if (and when) I have time, I will track down the bugs
and report them, but at the moment I was hoping for a quick fix where
someone else already on the list may have identified the OSX
idiosyncrasies and had sample code that worked around them :-)


Hmmm - I have quite a long memory but it doesn't quite stretch as far
back as 4.6.4 these days so I can't say what might have changed on the
process related commands in versions since then.


I do know that there *were* significant bugs in process communication on
OS X for quite some time after I started working at LiveCode (well,
RunRev back then) - which gradually got fixed. I'm pretty sure that the
most recent versions (certainly since 6.x) should be working correctly
- but I can't really say much about versions before then.


So I suspect the issues might be to do with the 4.6.4 side - rather than 
the version on the other side (presumably 6.7.11?).


In terms of the changes to field indices - 6.x still uses the same
unicode flag for style runs in the field that 4.6.4 uses - the data
inside the engine is the same, but the char indices are not.


I'm pretty sure it would be possible to write a handler which takes the
styledText array of a field in 6.7.11 and a list of old indices,
returning a list of new char indices... Would that help?


Warmest Regards,

Mark.

--
Mark Waddingham ~ m...@livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps



Re: Interprocess Communication (IPC) under OSX

2017-12-18 Thread Richard Gaskin via use-livecode

Paul Dupuis wrote:

> On 12/18/2017 1:49 PM, Richard Gaskin via use-livecode wrote:
>> Which IPC method are you using?
>>
> Using LiveCode's open process (and related statements) vs open socket
> (IP) or using files.
>
> I have a set of stacks that work perfectly under Windows. The main app
> uses open process to spawn/launch the helper app standalone, which in
> turn listens for messages from the parent. The main app sends a message
> to the helper app, it does its thing and returns information to the
> main app (a series of these exchanges actually takes place), and then
> the main app sends a message for the helper to exit. Checks are built
> in for errors, and the open processes are checked to see if the helper
> crashed or was quit (by a forced quit) so that the helper can be
> restarted if needed. As noted, it works great.
>
> In principle, the same code should work equally great under OSX, but
> it does not. And yes, if (and when) I have time, I will track down the
> bugs and report them, but at the moment I was hoping for a quick fix
> where someone else already on the list may have identified the OSX
> idiosyncrasies and had sample code that worked around them :-)

If it's a bug in the engine no scripter can help.

But given how much a Unix like macOS depends on stdin/stdout streams,
I'd be surprised to see a regression in that part of the engine.


Windows and Unix do handle processes differently under the hood, but if 
LC is doing what it ideally should be doing I agree we shouldn't need 
OS-forked code.


Diagnosing whether this is an engine issue or a platform difference 
would require review of the code for both processes.


--
 Richard Gaskin
 Fourth World Systems
 Software Design and Development for the Desktop, Mobile, and the Web
 
 ambassa...@fourthworld.com    http://www.FourthWorld.com



Re: Interprocess Communication (IPC) under OSX

2017-12-18 Thread Paul Dupuis via use-livecode
On 12/18/2017 1:49 PM, Richard Gaskin via use-livecode wrote:
> Paul Dupuis wrote:
>
> > I am using IPC vs Sockets...
>
> "IPC" is usually a generic term, encompassing a wide range of
> inter-process communications methods which includes sockets, files,
> shared memory, pipes, and more.
>
> Which IPC method are you using?
>
Using LiveCode's open process (and related statements) vs open socket
(IP) or using files.

I have a set of stacks that work perfectly under Windows. The main app
uses open process to spawn/launch the helper app standalone, which in
turn listens for messages from the parent. The main app sends a message
to the helper app, it does its thing and returns information to the main
app (a series of these exchanges actually takes place), and then the
main app sends a message for the helper to exit. Checks are built in for
errors, and the open processes are checked to see if the helper crashed
or was quit (by a forced quit) so that the helper can be restarted if
needed. As noted, it works great.
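
For anyone who wants to see the shape of that exchange outside LiveCode,
here is a small Python sketch of the same parent/helper pattern. It is an
analogy only, under the assumption that the messages travel as whole lines
over the helper's stdin/stdout. The helper script, the "quit" message and
the run_exchange name are all invented for the example and are not taken
from the actual stacks.

```python
import subprocess
import sys

# Hypothetical helper: answers each request line with its whitespace
# stripped, and exits when it receives the line "quit".
HELPER_SOURCE = r'''
import sys
for line in sys.stdin:
    request = line.rstrip("\n")
    if request == "quit":
        break
    print("".join(request.split()), flush=True)
'''


def run_exchange(requests):
    """Spawn the helper, send each request, collect one reply per request."""
    helper = subprocess.Popen(
        [sys.executable, "-c", HELPER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line buffered, since the protocol is one line per message
    )
    replies = []
    try:
        for request in requests:
            helper.stdin.write(request + "\n")
            helper.stdin.flush()
            replies.append(helper.stdout.readline().rstrip("\n"))
        # Ask the helper to exit, as the main app does at the end.
        helper.stdin.write("quit\n")
        helper.stdin.flush()
    finally:
        helper.stdin.close()
        helper.wait(timeout=5)
    return replies, helper.returncode


if __name__ == "__main__":
    print(run_exchange(["Early Childhood  Behavior X", "char S to T"]))
```

The sketch does not cover restarting a crashed helper. A parent could
check helper.poll() before each write to detect that case.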

In principle, the same code should work equally great under OSX, but it
does not. And yes, if (and when) I have time, I will track down the bugs
and report them, but at the moment I was hoping for a quick fix where
someone else already on the list may have identified the OSX
idiosyncrasies and had sample code that worked around them :-)






Re: Interprocess Communication (IPC) under OSX

2017-12-18 Thread Richard Gaskin via use-livecode

Paul Dupuis wrote:

> I am using IPC vs Sockets...

"IPC" is usually a generic term, encompassing a wide range of 
inter-process communications methods which includes sockets, files, 
shared memory, pipes, and more.


Which IPC method are you using?

--
 Richard Gaskin
 Fourth World Systems
 Software Design and Development for the Desktop, Mobile, and the Web
 
 ambassa...@fourthworld.com    http://www.FourthWorld.com
