Re: crash when using local display but not remote

2020-02-25 Thread Fred Kiefer



> On 25.02.2020 at 17:25, Riccardo Mottola wrote:
> 
> Fred Kiefer wrote:
>> I still don’t know for sure what is going on here. Most likely we are 
>> overwriting some other data when copying the image. To prevent this I made 
>> the preconditions of the new code explicit. If these are not fulfilled, no 
>> app icon for WindowMaker will be created. I also changed the code to only 
>> create this icon for WindowMaker, so that fewer systems will be affected by 
>> this issue. Could you please try again on your Letux?
> 
> I just tried again, now Ink starts up both locally & remotely!
> However, I get this warning when exporting to a "real color" display
> 
> 2020-02-25 16:36:17.620 Ink[2192:2192] Unsupported context depth 24
> 2020-02-25 16:36:18.236 Ink[2192:2192] XShm not supported, XShmAttach()
> failed.
> 2020-02-25 16:36:18.238 Ink[2192:2192] Falling back to normal XImage
> (will be slower).
> 
> 
> I suppose XShm fails because it is over ssh? But what is the first message?

That message is just what I added to prevent the issue. Instead of trying to 
copy 32 bits per pixel into a buffer that is far too small, the code now 
prints this message and does not create an application icon for WindowMaker. 
If this message bothers you, I can turn it into a debug message.
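
As a rough sketch of the kind of precondition check meant here (not the actual backend code): before copying 32-bit pixel data, compare the bytes the source needs with what a destination of the given pixel width can hold. The function names are hypothetical, and the sketch treats the reported depth as the number of bits stored per pixel, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Bytes needed for an unpadded image of the given size and pixel width.
 * Hypothetical helper for illustration only. */
size_t image_bytes(size_t width, size_t height, unsigned bits_per_pixel)
{
    size_t row_bytes = (width * bits_per_pixel + 7) / 8; /* round up to whole bytes */
    return row_bytes * height;
}

/* Returns 1 if 32-bit source pixels fit into a destination whose pixels
 * are dst_bits_per_pixel wide, 0 otherwise (and logs the refusal).
 * Assumption: depth == bits stored per pixel. */
int can_copy_rgba32(size_t width, size_t height, unsigned dst_bits_per_pixel)
{
    assert(dst_bits_per_pixel > 0);
    if (image_bytes(width, height, 32) >
        image_bytes(width, height, dst_bits_per_pixel))
    {
        fprintf(stderr, "Unsupported context depth %u\n", dst_bits_per_pixel);
        return 0; /* skip the copy instead of overrunning the buffer */
    }
    return 1;
}
```

With a 64x64 icon, a 16-bit destination holds 8192 bytes while the 32-bit source needs 16384, so the check refuses the copy rather than writing past the end of the buffer.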

> Locally, instead, I only get
> Unsupported context depth 16
> 
> So... the depth is correct 16 and 24, but why unsupported?

Yes, this is a result of the app icon code change by Sergii. This new code path 
only works for 32-bit colour information. And no, we are not going to 
revert this change.


Re: crash when using local display but not remote

2020-02-25 Thread Riccardo Mottola
Hi!

Fred Kiefer wrote:
> I still don’t know for sure what is going on here. Most likely we are 
> overwriting some other data when copying the image. To prevent this I made the 
> preconditions of the new code explicit. If these are not fulfilled, no app 
> icon for WindowMaker will be created. I also changed the code to only create 
> this icon for WindowMaker, so that fewer systems will be affected by this 
> issue. Could you please try again on your Letux?

I just tried again, now Ink starts up both locally & remotely!
However, I get this warning when exporting to a "real color" display

2020-02-25 16:36:17.620 Ink[2192:2192] Unsupported context depth 24
2020-02-25 16:36:18.236 Ink[2192:2192] XShm not supported, XShmAttach()
failed.
2020-02-25 16:36:18.238 Ink[2192:2192] Falling back to normal XImage
(will be slower).


I suppose XShm fails because it is over ssh? But what is the first message?

Locally, instead, I only get
Unsupported context depth 16

So... the depth is correct 16 and 24, but why unsupported?

On the same computer that gave the 24-bit depth warning (the one I was
exporting the display to), if I instead run GNUstep and Ink locally, I see:
2020-02-25 16:02:43.318 Ink[24686:24686] No local time zone specified.
2020-02-25 16:02:43.319 Ink[24686:24686] Using time zone with absolute
offset 0.
2020-02-25 16:02:43.252 Ink[24686:24686] styleoffsets ... guessing offsets
2020-02-25 16:02:43.320 Ink[24686:24686] styleoffsets ... guessing offsets
2020-02-25 16:02:49.517 Ink[24686:24686] The font specified for NSFont,
FreeSans, can't be found.
2020-02-25 16:02:50.744 Ink[24686:24686] The font specified for NSFont,
FreeSans, can't be found.
2020-02-25 16:02:50.998 Ink[24686:24686] The font specified for NSFont,
FreeSans, can't be found.
2020-02-25 16:02:51.071 Ink[24686:24686] The font specified for NSFont,
FreeSans, can't be found.
2020-02-25 16:02:51.072 Ink[24686:24686] The font specified for NSFont,
FreeSans, can't be found.
2020-02-25 16:02:51.646 Ink[24686:24686] Ignore left offset change from
0 to 31
2020-02-25 16:02:51.646 Ink[24686:24686] Ignore right offset change from
0 to -31
2020-02-25 16:02:51.647 Ink[24686:24686] Ignore top offset change from 0
to 31
2020-02-25 16:02:51.647 Ink[24686:24686] Ignore bottom offset change
from 0 to -31
2020-02-25 16:02:51.647 Ink[24686:24686] Reparent was with offset 31 31
2020-02-25 16:02:51.647 Ink[24686:24686] Parent border,width,height 0,64,64
2020-02-25 16:02:51.977 Ink[24686:24686] The font specified for NSFont,
FreeSans, can't be found.



Riccardo



Re: crash when using local display but not remote

2020-02-21 Thread Fred Kiefer



> On 19.02.2020 at 00:37, Riccardo Mottola wrote:
> 
> Fred Kiefer wrote:
>> You won’t have to look up the WINGs documentation. In most cases we do not 
>> use a separate Wraster library but have copies of the files in this 
>> directory. The one you are looking for is util.c, but first check your 
>> configuration to see whether the Wraster library or our local files get used. 
>> It could well be that there is an issue with that code. Or the opposite may 
>> be true and you have one of the rare cases where Wraster is available but 
>> faulty. But that is highly unlikely.
> 
> I guess it is not used/detected. In config.log:
> 
> WITH_WRASTER='no'
> 
> as the output LIBS:
> 
> LIBS='-L/usr/lib -lart_lgpl_2 -lm -lfreetype -lz    -lXt -lXext -lX11  '
> 
> given that I have:
> #define XSHM 1
> 
> that is the path taken.
> 
> As a final test, I commented out "free" inside
> 
> RDestroyXImage
> 
> And it does not avoid the issue! The issue instead appears to be the
> XDestroyImage call: with that commented out, I get no crash.
> 
> I checked: the non-shared path is taken.
> RDestroyXImage rx->image is not shared!
> 
> What's going on?

I still don’t know for sure what is going on here. Most likely we are 
overwriting some other data when copying the image. To prevent this I made the 
preconditions of the new code explicit. If these are not fulfilled, no app icon 
for WindowMaker will be created. I also changed the code to only create this 
icon for WindowMaker, so that fewer systems will be affected by this issue. 
Could you please try again on your Letux?


Re: crash when using local display but not remote

2020-02-18 Thread Riccardo Mottola
Hoi Fred!

Fred Kiefer wrote:
> You won’t have to look up the WINGs documentation. In most cases we do not 
> use a separate Wraster library but have copies of the files in this 
> directory. The one you are looking for is util.c, but first check your 
> configuration to see whether the Wraster library or our local files get used. 
> It could well be that there is an issue with that code. Or the opposite may 
> be true and you have one of the rare cases where Wraster is available but 
> faulty. But that is highly unlikely.

I guess it is not used/detected. In config.log:

WITH_WRASTER='no'

as the output LIBS:

LIBS='-L/usr/lib -lart_lgpl_2 -lm -lfreetype -lz    -lXt -lXext -lX11  '

given that I have:
#define XSHM 1

that is the path taken.

As a final test, I commented out "free" inside

RDestroyXImage

And it does not avoid the issue! The issue instead appears to be the
XDestroyImage call: with that commented out, I get no crash.

I checked: the non-shared path is taken.
RDestroyXImage rx->image is not shared!

What's going on?

Riccardo




Re: crash when using local display but not remote

2020-02-18 Thread Fred Kiefer



> On 18.02.2020 at 21:59, Riccardo Mottola wrote:
> 
> Fred Kiefer wrote:
>> 
>> You may have to set these separately. I was hoping there was a way to 
>> specify an array here, but did not check. So the easiest way is
>> 
>> Ink —GNU-Debug=Dflt —GNU-Debug=XGTrace —GNU-Debug=Frame
> 
> Oh, that finally helps; it is actually --GNU-Debug=xxx (two dashes)
> 
> I get this:
> 
> root@hobbit:~# Ink --GNU-Debug=Dflt --GNU-Debug=XGTrace --GNU-Debug=Frame
> 2009-12-28 22:54:25.191 Ink[312:312] WindowMaker hack: Preparing app
> icon window
> 2009-12-28 22:54:25.297 Ink[312:312] DPSwindow: {x = 0; y = 0; width =
> 0; height = 0} 2
> 2009-12-28 22:54:25.306 Ink[312:312] Draw mech 1 for screen 0
> 2009-12-28 22:54:25.310 Ink[312:312] O2X 0, 40, {x = 0; y = 0; width =
> 0; height = 0}, {x = 0; y = 480; width = 0; height = 0}
> 2009-12-28 22:54:25.315 Ink[312:312] X2H 0, 40, {x = 0; y = 480; width =
> 2; height = 2}, {x = 0; y = 480; width = 2; height = 2}
> Xlib:  extension "SYNC" missing on display ":0.0".
> 2009-12-28 22:54:26.003 Ink[312:312] Hint posn 1: 0, 480
> 2009-12-28 22:54:26.005 Ink[312:312] Hint size 1: 2, 2
> *** glibc detected *** double free or corruption (out): 0x006415f0 ***
> Aborted
> 
> 
> The values look fine to me.
> 
> I started putting in some logs, then more logs and even more logs :-O
> 
> I am sure the issue is happening in here:
>   if (!didCreatePixmaps)
>     {
>       [self _createAppIconPixmaps];
>     }
> 
> I heavily log-traced _createAppIconPixmaps too
> 
> The crash happens precisely at the line:
>   RDestroyXImage(rcontext, rxImage);
> 
> So for some reason the new code here triggers a double free. I wonder if it
> is a good idea to use a WINGs function here at all when we did not before.
> I commented the Destroy out and things do work now! But I don't think
> this is the correct solution.
> 
> I wanted to look up some WINGs documentation and check, but it appears to
> have disappeared into net oblivion?

You won’t have to look up the WINGs documentation. In most cases we do not use 
a separate Wraster library but have copies of the files in this directory. The 
one you are looking for is util.c, but first check your configuration to see 
whether the Wraster library or our local files get used. It could well be that 
there is an issue with that code. Or the opposite may be true and you have one 
of the rare cases where Wraster is available but faulty. But that is highly 
unlikely.


Re: crash when using local display but not remote

2020-02-18 Thread Riccardo Mottola
Fred!

Fred Kiefer wrote:
>
> You may have to set these separately. I was hoping there was a way to specify 
> an array here, but did not check. So the easiest way is
>
> Ink —GNU-Debug=Dflt —GNU-Debug=XGTrace —GNU-Debug=Frame

Oh, that finally helps; it is actually --GNU-Debug=xxx (two dashes)

I get this:

root@hobbit:~# Ink --GNU-Debug=Dflt --GNU-Debug=XGTrace --GNU-Debug=Frame
2009-12-28 22:54:25.191 Ink[312:312] WindowMaker hack: Preparing app
icon window
2009-12-28 22:54:25.297 Ink[312:312] DPSwindow: {x = 0; y = 0; width =
0; height = 0} 2
2009-12-28 22:54:25.306 Ink[312:312] Draw mech 1 for screen 0
2009-12-28 22:54:25.310 Ink[312:312] O2X 0, 40, {x = 0; y = 0; width =
0; height = 0}, {x = 0; y = 480; width = 0; height = 0}
2009-12-28 22:54:25.315 Ink[312:312] X2H 0, 40, {x = 0; y = 480; width =
2; height = 2}, {x = 0; y = 480; width = 2; height = 2}
Xlib:  extension "SYNC" missing on display ":0.0".
2009-12-28 22:54:26.003 Ink[312:312] Hint posn 1: 0, 480
2009-12-28 22:54:26.005 Ink[312:312] Hint size 1: 2, 2
*** glibc detected *** double free or corruption (out): 0x006415f0 ***
Aborted


The values look fine to me.

I started putting in some logs, then more logs and even more logs :-O

I am sure the issue is happening in here:
  if (!didCreatePixmaps)
    {
      [self _createAppIconPixmaps];
    }

I heavily log-traced _createAppIconPixmaps too

The crash happens precisely at the line:
  RDestroyXImage(rcontext, rxImage);

So for some reason the new code here triggers a double free. I wonder if it
is a good idea to use a WINGs function here at all when we did not before.
I commented the Destroy out and things do work now! But I don't think
this is the correct solution.

I wanted to look up some WINGs documentation and check, but it appears to
have disappeared into net oblivion?


Riccardo



Re: crash when using local display but not remote

2020-02-17 Thread Fred Kiefer



> On 17.02.2020 at 23:53, Riccardo Mottola wrote:
> 
> Fred Kiefer wrote:
>> 
>> 
>> With all the latest changes could you please try again and report the stack 
>> trace you are getting? It would also help if you could find out which code 
>> is crashing. For this you could start by getting the normal debug output 
>> from the backend (—GNU-Debug=Dflt,XGTrace,Frame) and after you have the 
>> general region just add a few NSLog statements of your own.
> 
> How exactly do you use that? I tried both:
> 
> defaults write GNU-Debug {Dflt,XGTrace,Frame}
> 
> as well as:
> 
> Ink --GNU-Debug=Dflt,XGTrace,Frame


You may have to set these separately. I was hoping there was a way to specify 
an array here, but did not check. So the easiest way is

Ink —GNU-Debug=Dflt —GNU-Debug=XGTrace —GNU-Debug=Frame

>> The original message you posted (glibc detected *** double free or 
>> corruption (out)) points to a free call. We have plenty of these, but this 
>> might be a hint when you are getting closer.
> 
> I hope so. Building with "debug=yes" and running in gdb makes for an
> incredibly slow run, but no better trace:
> 
> Program received signal SIGABRT, Aborted.
> [Switching to Thread 16384 (LWP 12897)]
> 0x2b898a94 in kill () from /lib/libc.so.6
> (gdb) bt
> #0  0x2b898a94 in kill () from /lib/libc.so.6
> #1  0x2b674b88 in pthread_kill () from /lib/libpthread.so.0
> #2  0x2b674c00 in raise () from /lib/libpthread.so.0
> #3  0x2b89a190 in abort () from /lib/libc.so.6
> #4  0x2b8d6294 in __fsetlocking () from /lib/libc.so.6
> #5  0x2b8d6294 in __fsetlocking () from /lib/libc.so.6
> Previous frame identical to this frame (corrupt stack?)

This does not really help.





Re: crash when using local display but not remote

2020-02-17 Thread Riccardo Mottola
Hi Fred,

Fred Kiefer wrote:
> It is rather the latter, and that would also only change for the application 
> icon. We should instead concentrate on the crash you are having on the Letux. 
> You wrote that the version from September 14th works fine. Does this version 
> also show the message "Xlib:  extension "SYNC" missing on display ":0.0"."? 
> If it does, we may ignore this hint; otherwise it might be worthwhile to 
> follow.

I double-checked by compiling and installing the old version.
I get the SYNC message several times (I think at least once for every
window) and everything works. So we should ignore it: it appears either
way, so it is unrelated to the crash.

>
> With all the latest changes could you please try again and report the stack 
> trace you are getting? It would also help if you could find out which code is 
> crashing. For this you could start by getting the normal debug output from 
> the backend (—GNU-Debug=Dflt,XGTrace,Frame) and after you have the general 
> region just add a few NSLog statements of your own.

How exactly do you use that? I tried both:

defaults write GNU-Debug {Dflt,XGTrace,Frame}

as well as:

Ink --GNU-Debug=Dflt,XGTrace,Frame


but nothing useful gets printed out, just the crash.

Do I need to recompile something with some specific option?

> The original message you posted (glibc detected *** double free or corruption 
> (out)) points to a free call. We have plenty of these, but this might be a 
> hint when you are getting closer.

I hope so. Building with "debug=yes" and running in gdb makes for an
incredibly slow run, but no better trace:

Program received signal SIGABRT, Aborted.
[Switching to Thread 16384 (LWP 12897)]
0x2b898a94 in kill () from /lib/libc.so.6
(gdb) bt
#0  0x2b898a94 in kill () from /lib/libc.so.6
#1  0x2b674b88 in pthread_kill () from /lib/libpthread.so.0
#2  0x2b674c00 in raise () from /lib/libpthread.so.0
#3  0x2b89a190 in abort () from /lib/libc.so.6
#4  0x2b8d6294 in __fsetlocking () from /lib/libc.so.6
#5  0x2b8d6294 in __fsetlocking () from /lib/libc.so.6
Previous frame identical to this frame (corrupt stack?)

Riccardo



Re: crash when using local display but not remote

2020-02-08 Thread Fred Kiefer


> On 08.02.2020 at 12:50, Riccardo Mottola wrote:
> 
> On 07/02/2020 10:20, Fred Kiefer wrote:
>> I just committed a merge of the two approaches (plus a few compiler warning 
>> fixes). I hope this should fix Riccardo's issues. It might also be slightly 
>> faster as I removed the copy before the colour swap. At least the swapColor 
>> function now does what the comment above it claims:-)
> 
> Thanks for the work and analysis. Unfortunately it still doesn't work for me.
> 
> I grabbed the latest trunk; it contains your commit plus a later merge+commit 
> by Sergii.
> 
> Maybe it has to do with the fact that this is not a "DirectColor" display but 
> a "TrueColor" display? I remember reading a comment by you that indexed color 
> would not work.
> 
> This small display is TrueColor 16 planes.
> 
> Or does indexedColor for you mean the old 256-color, 8-bit mode? I hope we 
> did not break that too, for old machines :-P

It is rather the latter, and that would also only change for the application 
icon. We should instead concentrate on the crash you are having on the Letux. 
You wrote that the version from September 14th works fine. Does this version 
also show the message "Xlib:  extension "SYNC" missing on display ":0.0"."? If 
it does, we may ignore this hint; otherwise it might be worthwhile to follow.

With all the latest changes could you please try again and report the stack 
trace you are getting? It would also help if you could find out which code is 
crashing. For this you could start by getting the normal debug output from the 
backend (—GNU-Debug=Dflt,XGTrace,Frame) and after you have the general region 
just add a few NSLog statements of your own.

The original message you posted (glibc detected *** double free or corruption 
(out)) points to a free call. We have plenty of these, but this might be a hint 
when you are getting closer.

Fred






Re: crash when using local display but not remote

2020-02-08 Thread Riccardo Mottola

Hi Fred!

On 07/02/2020 10:20, Fred Kiefer wrote:

> I just committed a merge of the two approaches (plus a few compiler warning 
> fixes). I hope this should fix Riccardo's issues. It might also be slightly 
> faster as I removed the copy before the colour swap. At least the swapColor 
> function now does what the comment above it claims :-)


Thanks for the work and analysis. Unfortunately it still doesn't work 
for me.


I grabbed the latest trunk; it contains your commit plus a later merge+commit 
by Sergii.


Maybe it has to do with the fact that this is not a "DirectColor" 
display but a "TrueColor" display? I remember reading a comment by you 
that indexed color would not work.


This small display is TrueColor 16 planes.

Or does indexedColor for you mean the old 256-color, 8-bit mode? I hope we did 
not break that too, for old machines :-P



Riccardo




Re: crash when using local display but not remote

2020-02-08 Thread Riccardo Mottola

Hi,

On 04/02/2020 22:15, Sergii Stoian wrote:
> Clear explanation, thank you. Is it a usual case that an image contains 
> more bytes than the image size (width * height)?
>
> I feel we need to bring back the old code with your explanation in a comment.
> What do you think?



I don't know about this specific image case, but generally speaking, yes.

Cocoa does this now (since 10.6 at least) quite aggressively, so "bytes 
per row" is not necessarily width * samplesPerPixel. In GNUstep I have not 
noticed that.


So I had to fix quite some image processing code (e.g. PRICE, 
LaternaMagica) which worked fine in GNUstep and on a 10.4 Mac. I guess it 
is done for optimization, such as address alignment.
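
A small sketch of why "bytes per row" can exceed width * samplesPerPixel: if rows are padded up to an alignment boundary, pixel addresses must be computed from the stride, not from the width. The function names are hypothetical, and the 16-byte alignment in the note below is only an illustrative assumption, not a value Cocoa or GNUstep is known to use.

```c
#include <assert.h>
#include <stddef.h>

/* Row length in bytes after rounding the packed row up to a multiple of
 * `alignment`. Hypothetical helper for illustration only. */
size_t padded_stride(size_t width, size_t bytes_per_pixel, size_t alignment)
{
    size_t packed = width * bytes_per_pixel;
    assert(alignment > 0);
    return (packed + alignment - 1) / alignment * alignment;
}

/* Byte offset of pixel (x, y): rows advance by the stride (bytesPerRow),
 * not by width * bytes_per_pixel. */
size_t pixel_offset(size_t x, size_t y, size_t bytes_per_pixel, size_t stride)
{
    return y * stride + x * bytes_per_pixel;
}
```

For example, a 5-pixel-wide RGB row is 15 bytes packed but 16 bytes with 16-byte alignment, so code that assumes width * samplesPerPixel drifts by one byte per row.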



Riccardo




Re: crash when using local display but not remote

2020-02-07 Thread Fred Kiefer



> On 04.02.2020 at 22:15, Sergii Stoian wrote:
> 
>> On Feb 4, 2020, at 20:43, Fred Kiefer  wrote:
>> In general it is safer as the new code expects that the image is fully 
>> packed. (You moved the comment with the conversion from unpacked to packed 
>> over to the swap function) If bytesPerRow is not equal to w * 4 (there may 
>> be a few extra bytes to align stuff a bit), then the new code would not 
>> transfer the correct data.  We would end up with random garbage in between. 
>> But in this special case the image comes from GSStandardImage and at least 
>> for the case where there are alpha values that function should already 
>> return a packed image. Thinking about it the old code should only have 
>> copied w * 4 bytes for each row. The old code could have written a few bytes 
>> past the pixels array.
> 
> Clear explanation, thank you. Is it a usual case that an image contains more 
> bytes than the image size (width * height)?
> I feel we need to bring back the old code with your explanation in a comment.
> What do you think?

I just committed a merge of the two approaches (plus a few compiler warning 
fixes). I hope this should fix Riccardo's issues. It might also be slightly 
faster as I removed the copy before the colour swap. At least the swapColor 
function now does what the comment above it claims :-)

Fred


Re: crash when using local display but not remote

2020-02-04 Thread Sergii Stoian

> On Feb 4, 2020, at 20:43, Fred Kiefer  wrote:
> 
> 
> 
>> On 04.02.2020 at 11:21, Sergii Stoian wrote:
>> 
>> 
>> On Mon, Feb 3, 2020 at 8:59 AM Fred Kiefer  wrote:
>> 
>>> On 03.02.2020 at 00:53, Sergii Stoian wrote:
>>> 
>>> On Mon, Feb 3, 2020 at 1:05 AM Fred Kiefer  wrote:
>>> 
>>> I just ran a quick scan with valgrind and this did not detect any obvious 
>>> wrong memory access. Looking at the code once again I see that line 4276 
>>> may be wrong for certain bytesPerRow values. Here the old code that copied 
>>> over line by line is safer. Maybe we could check bytesPerRow versus 
>>> pixelsWide*4 and use the old code if they are not the same?
>>> 
>>> Line 4276 looks like this: "xcursorImage->yhot = hotp.y;" Do you mean 
>>> memcpy call at 4279?
>> 
>> Yes, it was line 4276 in the original merge commit, but has changed since 
>> then.
>> 
>> Could you please explain why old code is safer?
>> 
>> Old code:
>> for (row = 0; row < h; row++)
>>   {
>>     memcpy((char*)xcursorImage->pixels + (row * (w * 4)),
>>            data + (row * bytesPerRow),
>>            bytesPerRow);
>>   }
>> 
>> New code:
>> 
>> memcpy((char*)xcursorImage->pixels, data, w * h * colors);
> 
> 
> In general it is safer as the new code expects that the image is fully 
> packed. (You moved the comment with the conversion from unpacked to packed 
> over to the swap function) If bytesPerRow is not equal to w * 4 (there may be 
> a few extra bytes to align stuff a bit), then the new code would not transfer 
> the correct data.  We would end up with random garbage in between. But in 
> this special case the image comes from GSStandardImage and at least for the 
> case where there are alpha values that function should already return a 
> packed image. Thinking about it the old code should only have copied w * 4 
> bytes for each row. The old code could have written a few bytes past the 
> pixels array.

Clear explanation, thank you. Is it a usual case that an image contains more 
bytes than the image size (width * height)?
I feel we need to bring back the old code with your explanation in a comment.
What do you think?

Sergii

Re: crash when using local display but not remote

2020-02-04 Thread Fred Kiefer



> On 04.02.2020 at 11:21, Sergii Stoian wrote:
> 
> 
> On Mon, Feb 3, 2020 at 8:59 AM Fred Kiefer  wrote:
> 
> > On 03.02.2020 at 00:53, Sergii Stoian wrote:
> > 
> > On Mon, Feb 3, 2020 at 1:05 AM Fred Kiefer  wrote:
> > 
> > I just ran a quick scan with valgrind and this did not detect any obvious 
> > wrong memory access. Looking at the code once again I see that line 4276 
> > may be wrong for certain bytesPerRow values. Here the old code that copied 
> > over line by line is safer. Maybe we could check bytesPerRow versus 
> > pixelsWide*4 and use the old code if they are not the same?
> > 
> > Line 4276 looks like this: "xcursorImage->yhot = hotp.y;" Do you mean 
> > memcpy call at 4279?
> 
> Yes, it was line 4276 in the original merge commit, but has changed since 
> then.
> 
> Could you please explain why old code is safer?
> 
> Old code:
> for (row = 0; row < h; row++)
>   {
>     memcpy((char*)xcursorImage->pixels + (row * (w * 4)),
>            data + (row * bytesPerRow),
>            bytesPerRow);
>   }
> 
> New code:
> 
> memcpy((char*)xcursorImage->pixels, data, w * h * colors);


In general the old code is safer, as the new code expects the image to be 
fully packed. (You moved the comment about the conversion from unpacked to 
packed over to the swap function.) If bytesPerRow is not equal to w * 4 (there 
may be a few extra bytes per row for alignment), then the new code would not 
transfer the correct data: we would end up with random garbage in between. But 
in this special case the image comes from GSStandardImage, and at least for 
the case where there are alpha values that function should already return a 
packed image. Thinking about it, the old code should only have copied w * 4 
bytes for each row. As written, it could have written a few bytes past the end 
of the pixels array.
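
For illustration, here is a minimal sketch of the row-wise copy described above: it copies only the packed row width (w * 4 bytes for RGBA) per row and skips any padding, so it neither pulls padding bytes into the destination nor writes past its end. The helper name copy_rows_packed and its parameters are hypothetical; this is not the actual backend code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy an image whose rows may be padded (src_stride == bytesPerRow)
 * into a tightly packed destination of width * bytes_per_pixel bytes
 * per row. Hypothetical helper for illustration only. */
void copy_rows_packed(unsigned char *dst, const unsigned char *src,
                      size_t width, size_t height,
                      size_t bytes_per_pixel, size_t src_stride)
{
    size_t packed = width * bytes_per_pixel;

    assert(src_stride >= packed); /* a row cannot be shorter than its pixels */

    if (src_stride == packed)
    {
        /* Already packed: one bulk copy is safe. */
        memcpy(dst, src, packed * height);
        return;
    }

    for (size_t row = 0; row < height; row++)
    {
        /* Copy only the pixel bytes; the padding at the end of each
         * source row is skipped. */
        memcpy(dst + row * packed, src + row * src_stride, packed);
    }
}
```

The difference from the old loop is the third memcpy argument: the packed row size instead of bytesPerRow.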




Re: crash when using local display but not remote

2020-02-04 Thread Sergii Stoian
On Mon, Feb 3, 2020 at 8:59 AM Fred Kiefer  wrote:

>
> > On 03.02.2020 at 00:53, Sergii Stoian wrote:
> >
> > On Mon, Feb 3, 2020 at 1:05 AM Fred Kiefer  wrote:
> >
> > I just ran a quick scan with valgrind and this did not detect any
> obvious wrong memory access. Looking at the code once again I see that line
> 4276 may be wrong for certain bytesPerRow values. Here the old code that
> copied over line by line is safer. Maybe we could check bytesPerRow versus
> pixelsWide*4 and use the old code if they are not the same?
> >
> > Line 4276 looks like this: "xcursorImage->yhot = hotp.y;" Do you mean
> memcpy call at 4279?
>
> Yes, it was line 4276 in the original merge commit, but has changed since
> then.
>

Could you please explain why the old code is safer?

Old code:
for (row = 0; row < h; row++)
  {
    memcpy((char*)xcursorImage->pixels + (row * (w * 4)),
           data + (row * bytesPerRow),
           bytesPerRow);
  }

New code:

memcpy((char*)xcursorImage->pixels, data, w * h * colors);

-- 
Sergii Stoian


Re: crash when using local display but not remote

2020-02-02 Thread Fred Kiefer


> On 03.02.2020 at 00:53, Sergii Stoian wrote:
> 
> On Mon, Feb 3, 2020 at 1:05 AM Fred Kiefer  wrote:
> 
> I just ran a quick scan with valgrind and this did not detect any obvious 
> wrong memory access. Looking at the code once again I see that line 4276 may 
> be wrong for certain bytesPerRow values. Here the old code that copied over 
> line by line is safer. Maybe we could check bytesPerRow versus pixelsWide*4 
> and use the old code if they are not the same?
> 
> Line 4276 looks like this: "xcursorImage->yhot = hotp.y;" Do you mean memcpy 
> call at 4279?

Yes, it was line 4276 in the original merge commit, but has changed since then.




Re: crash when using local display but not remote

2020-02-02 Thread Sergii Stoian
Fred,

On Mon, Feb 3, 2020 at 1:05 AM Fred Kiefer  wrote:

>
> > On 02.02.2020 at 21:38, Riccardo Mottola <riccardo.mott...@libero.it> wrote:
> >
> > On 1/28/20 11:28 AM, Sergii Stoian wrote:
> >>
> >> I'm not sure, just an idea: this problem may be related to multithreading
> >> being enabled in X11, probably due to an outdated X server.
> >> Could you please try to comment out the line in x11/XGServer.m that
> >> contains XInitThreads() (line 419) and recompile/reinstall the backend?
> >
> >
> > I was able to narrow down the offending breakage.
> >
> > As of 14 September (version bump) everything worked fine on the Letux400
> MIPS-LE
> >
> > As of 13 January it is already broken
> >
> > As of 14 January it is still broken giving the memory error on startup.
> (I include obviously the minor ALPHA_THRESHOLD fix)
> >
> >
> > I'm a little bit confused with the commits of 13th and 14th January,
> since they seem to contain similar things!
>
> These are all different commits that belong to the same pull request and
> at the end the branch gets merged. What you see are the single commits plus
> the final merge.
>
> > Somehow, however in the "fixes" for the icon there appears to be a
> memory issue!
>
> This is really hard to tell. Do you have a stack trace or any other
> analysis of the issue? Perhaps scattering log statements in the changed
> functions might help to narrow it down a bit.
>
> I just ran a quick scan with valgrind and this did not detect any obvious
> wrong memory access. Looking at the code once again I see that line 4276
> may be wrong for certain bytesPerRow values. Here the old code that copied
> over line by line is safer. Maybe we could check bytesPerRow versus
> pixelsWide*4 and use the old code if they are not the same?
>

Line 4276 looks like this: "xcursorImage->yhot = hotp.y;" Do you mean
memcpy call at 4279?


> But there are also other possible causes. If your old Letux uses indexed
> colours the old code for _createAppIconPixmaps would be required.
>

-- 
Sergii Stoian


Re: crash when using local display but not remote

2020-02-02 Thread Fred Kiefer


> On 02.02.2020 at 21:38, Riccardo Mottola wrote:
> 
> On 1/28/20 11:28 AM, Sergii Stoian wrote:
>> 
>> I'm not sure, just an idea: this problem may be related to multithreading 
>> being enabled in X11, probably due to an outdated X server.
>> Could you please try to comment out the line in x11/XGServer.m that contains 
>> XInitThreads() (line 419) and recompile/reinstall the backend?
> 
> 
> I was able to narrow down the offending breakage.
> 
> As of 14 September (version bump) everything worked fine on the Letux400 
> MIPS-LE
> 
> As of 13 January it is already broken
> 
> As of 14 January it is still broken giving the memory error on startup. (I 
> include obviously the minor ALPHA_THRESHOLD fix)
> 
> 
> I'm a little bit confused with the commits of 13th and 14th January, since 
> they seem to contain similar things!

These are all different commits that belong to the same pull request and at the 
end the branch gets merged. What you see are the single commits plus the final 
merge.

> Somehow, however in the "fixes" for the icon there appears to be a memory 
> issue!

This is really hard to tell. Do you have a stack trace or any other analysis of 
the issue? Perhaps scattering log statements in the changed functions might 
help to narrow it down a bit.

I just ran a quick scan with valgrind and this did not detect any obviously 
wrong memory access. Looking at the code once again, I see that line 4276 may 
be wrong for certain bytesPerRow values. Here the old code that copied over 
line by line is safer. Maybe we could check bytesPerRow versus pixelsWide*4 
and use the old code if they are not the same?

But there are also other possible causes. If your old Letux uses indexed 
colours the old code for _createAppIconPixmaps would be required.
 


Re: crash when using local display but not remote

2020-02-02 Thread Riccardo Mottola

Hi!


On 1/28/20 11:28 AM, Sergii Stoian wrote:


> I'm not sure, just an idea: this problem may be related to multithreading 
> being enabled in X11, probably due to an outdated X server.
> Could you please try to comment out the line in x11/XGServer.m that 
> contains XInitThreads() (line 419) and recompile/reinstall the backend?



I was able to narrow down the offending breakage.

As of 14 September (version bump) everything worked fine on the Letux400 
MIPS-LE


As of 13 January it is already broken

As of 14 January it is still broken giving the memory error on startup. 
(I include obviously the minor ALPHA_THRESHOLD fix)



I'm a little bit confused with the commits of 13th and 14th January, 
since they seem to contain similar things!



Somehow, however in the "fixes" for the icon there appears to be a 
memory issue!



Riccardo




Re: crash when using local display but not remote

2020-02-02 Thread Riccardo Mottola

Hi!

On 1/28/20 11:28 AM, Sergii Stoian wrote:


> I'm not sure, just an idea: this problem may be related to multithreading 
> being enabled in X11, probably due to an outdated X server.
> Could you please try to comment out the line in x11/XGServer.m that 
> contains XInitThreads() (line 419) and recompile/reinstall the backend?



I tried and it does not help.


Now... the best would be to back out some changes and test, but without 
direct git/svn access that is quite difficult.



I'll see if I can nevertheless understand this better.


Riccardo




Re: crash when using local display but not remote

2020-01-28 Thread Sergii Stoian
Hi,

On Tue, Jan 28, 2020 at 9:53 AM Riccardo Mottola 
wrote:

> Hi!
>
> Sergii Stoian wrote:
> >> I’ve installed art on debian buster x64 with the gnustep runtime (2.0)
> from latest source on github.com/gnustep (does it answer your PS, Sergii
> ?)
> > It depends on what version of Debian is installed on Riccardo's MIPS
> machine...
> >
>
> I have Debian 4.0 and am limited to a 2.4 kernel. The rest is some sort
> of a mix:
> gcc 4.1.2 and its own runtime.
>
> I don't think this is an "art" bug, but an xlib/x11 issue, since it
> also happened when I accidentally compiled with the xlib instead of the
> art backend.
>
> For your information, I just compiled the art backend again on this small
> machine, with all your latest patches after the merge, and it compiles
> and also works fine if it is displayed on a remote X server.
> I usually export the display, because the small keyboard is
> unsuitable for typing the build commands :-P
>
> Riccardo
>

I'm not sure, just an idea: this problem may be related to multithreading
being enabled in X11. It is probably due to an outdated X server.
Could you please try to comment out the line in x11/XGServer.m that contains
XInitThreads() (line 419) and recompile/reinstall the backend?

Sergii


Re: crash when using local display but not remote

2020-01-27 Thread Riccardo Mottola
Hi!

Sergii Stoian wrote:
>> I’ve installed art on debian buster x64 with the gnustep runtime (2.0) from 
>> latest source on github.com/gnustep (does it answer your PS, Sergii ?)
> It depends on what version of Debian is installed on Riccardo's MIPS machine...
>

I have Debian 4.0 and am limited to a 2.4 kernel. The rest is some sort
of a mix:
gcc 4.1.2 and its own runtime.

I don't think this is an "art" bug, but an xlib/x11 issue, since it
also happened when I accidentally compiled with the xlib instead of the
art backend.

For your information, I just compiled the art backend again on this small
machine, with all your latest patches after the merge, and it compiles
and also works fine if it is displayed on a remote X server.
I usually export the display, because the small keyboard is
unsuitable for typing the build commands :-P

Riccardo



Re: crash when using local display but not remote

2020-01-26 Thread Riccardo Mottola

Hi,

On 1/26/20 1:49 AM, Sergii Stoian wrote:

Hi Riccardo !
I’ve installed art on debian buster x64 with the gnustep runtime (2.0) from 
latest source on github.com/gnustep (does it answer your PS, Sergii ?)

It depends on what version of Debian is installed on Riccardo's MIPS machine...



No, it is too generic. Anyway, it is good news that the art backend runs 
fine for you too! Did you notice the superior quality in rendering fonts 
and splines in Graphos? :-P


On i386 with gcc and the gcc runtime (but a totally different OS setup), 
everything works for me too! I tried a couple of apps and everything looks 
smooth & quick.



Even if it runs, a memory issue (which is what the libc error suggests) 
could go unnoticed, and valgrind should be able to detect it. I was never 
good at using it though; Fred is the master.


It is hard to work on that minimal machine. If I had a working CI20, it 
would be much easier!


In particular, the old TLS means that I cannot sync directly with GitHub :(

Riccardo



Re: crash when using local display but not remote

2020-01-26 Thread Riccardo Mottola

Hi Sergii,


On 1/25/20 12:13 AM, cobjective wrote:

Give more information, please.
Why do you think it’s the art backend? Did you try other backends/compilers/linkers?
What are your build options (runtime, library combo, compiler, linker)?

P.S.: Does it mean you can build the art backend without ftfont-old.m (with my 
changes to art)?



Yes, it means that the art backend builds without ftfont-old.m, so I 
just approved your changes.


Since everything works and displays when the display is exported to another 
computer, it is a strange issue, but I guess it is not font-related, since the 
art backend always uses the "original" client fonts, not the server's fonts.


The Debian version used is old... and is even a small "mix" of packages 
made to work years ago on that minimal netbook. Upgrading now is impossible 
due to kernel issues (although maybe Nikolaus is making progress, thanks to 
the CI20 board!).


The runtime is gcc's own and the compiler is from the gcc 4 series; the CPU 
is MIPS little-endian... so I doubt you have an "i386" machine with that stuff.


I will perform some further tests


Riccardo




Re: crash when using local display but not remote

2020-01-25 Thread Sergii Stoian
Hi Bertrand,

> On Jan 26, 2020, at 02:09, Bertrand Dekoninck  
> wrote:
> 
> Hi Riccardo !
> I’ve installed art on debian buster x64 with the gnustep runtime (2.0) from 
> latest source on github.com/gnustep (does it answer your PS, Sergii ?)

It depends on what version of Debian is installed on Riccardo's MIPS machine...

> I didn’t try Ink, but GWorkspace works fine (well, not its inspectors, of 
> course) locally. I don’t see your error, Riccardo. Does this error have 
> something to do with the "SYNC" extension not being present in your system's Xlib?
> 
> What should I do with valgrind ?
> 
> Bertrand
> 
>> Le 25 janv. 2020 à 00:13, cobjective  a écrit :
>> 
>> Hi, Riccardo!
>> 
>>> On Jan 24, 2020, at 00:51, Riccardo Mottola  
>>> wrote:
>>> 
>>> Hi!
>>> 
>>> To test the latest "libart" stuff, I upgraded all of GNUstep on the old and 
>>> venerable MIPS Book Letux400.
>>> 
>>> I got everything to build! yeah!
>>> 
>>> If I export the display through ssh, everything works and (albeit slow... 
>>> the Letux had the LAN connected through USB on the board) I get a fine 
>>> looking GNUstep!
>>> 
>>> If I use it locally, however, I get an immediate crash:
>>> 
>>> (gdb) r
>>> Starting program: /home/usr-GNUstep/Local/Tools/Ink
>>> [Thread debugging using libthread_db enabled]
>>> [New Thread 16384 (LWP 330)]
>>> Xlib:  extension "SYNC" missing on display ":0.0".
>>> *** glibc detected *** double free or corruption (out): 0x00636300 ***
>>> 
>>> Program received signal SIGABRT, Aborted.
>>> [Switching to Thread 16384 (LWP 330)]
>>> 0x2b898a94 in kill () from /lib/libc.so.6
>>> (gdb) bt
>>> #0  0x2b898a94 in kill () from /lib/libc.so.6
>>> #1  0x2b674b88 in pthread_kill () from /lib/libpthread.so.0
>>> #2  0x2b674c00 in raise () from /lib/libpthread.so.0
>>> #3  0x2b89a190 in abort () from /lib/libc.so.6
>>> #4  0x2b8d6294 in __fsetlocking () from /lib/libc.so.6
>>> #5  0x2b8d6294 in __fsetlocking () from /lib/libc.so.6
>>> Previous frame identical to this frame (corrupt stack?)
>>> 
>>> This does not look very useful; it looks like very early memory corruption?
>>> 
>>> I recompiled everything with debug=yes; I hope that still disables 
>>> optimizations correctly. I do not get a better stack, and not a better 
>>> outcome either!
>>> 
>>> Could someone using art on x86 or amd64 try valgrind?
>>> 
>> 
>> 
>> Give more information, please.
>> Why do you think it’s the art backend? Did you try other 
>> backends/compilers/linkers?
>> What are your build options (runtime, library combo, compiler, linker)?
>> 
>> P.S.: Does it mean you can build the art backend without ftfont-old.m (with my 
>> changes to art)?
>> 
>> Sergii
> 




Re: crash when using local display but not remote

2020-01-25 Thread Bertrand Dekoninck
Hi Riccardo !
I’ve installed art on debian buster x64 with the gnustep runtime (2.0) from 
latest source on github.com/gnustep (does it answer your PS, Sergii ?)
I didn’t try Ink, but GWorkspace works fine (well, not its inspectors, of 
course) locally. I don’t see your error, Riccardo. Does this error have 
something to do with the "SYNC" extension not being present in your system's Xlib?

What should I do with valgrind ?

Bertrand

> Le 25 janv. 2020 à 00:13, cobjective  a écrit :
> 
> Hi, Riccardo!
> 
>> On Jan 24, 2020, at 00:51, Riccardo Mottola  
>> wrote:
>> 
>> Hi!
>> 
>> To test the latest "libart" stuff, I upgraded all of GNUstep on the old and 
>> venerable MIPS Book Letux400.
>> 
>> I got everything to build! yeah!
>> 
>> If I export the display through ssh, everything works and (albeit slow... 
>> the Letux had the LAN connected through USB on the board) I get a fine 
>> looking GNUstep!
>> 
>> If I use it locally, however, I get an immediate crash:
>> 
>> (gdb) r
>> Starting program: /home/usr-GNUstep/Local/Tools/Ink
>> [Thread debugging using libthread_db enabled]
>> [New Thread 16384 (LWP 330)]
>> Xlib:  extension "SYNC" missing on display ":0.0".
>> *** glibc detected *** double free or corruption (out): 0x00636300 ***
>> 
>> Program received signal SIGABRT, Aborted.
>> [Switching to Thread 16384 (LWP 330)]
>> 0x2b898a94 in kill () from /lib/libc.so.6
>> (gdb) bt
>> #0  0x2b898a94 in kill () from /lib/libc.so.6
>> #1  0x2b674b88 in pthread_kill () from /lib/libpthread.so.0
>> #2  0x2b674c00 in raise () from /lib/libpthread.so.0
>> #3  0x2b89a190 in abort () from /lib/libc.so.6
>> #4  0x2b8d6294 in __fsetlocking () from /lib/libc.so.6
>> #5  0x2b8d6294 in __fsetlocking () from /lib/libc.so.6
>> Previous frame identical to this frame (corrupt stack?)
>> 
>> This does not look very useful; it looks like very early memory corruption?
>> 
>> I recompiled everything with debug=yes; I hope that still disables 
>> optimizations correctly. I do not get a better stack, and not a better 
>> outcome either!
>> 
>> Could someone using art on x86 or amd64 try valgrind?
>> 
> 
> 
> Give more information, please.
> Why do you think it’s the art backend? Did you try other backends/compilers/linkers?
> What are your build options (runtime, library combo, compiler, linker)?
> 
> P.S.: Does it mean you can build the art backend without ftfont-old.m (with my 
> changes to art)?
> 
> Sergii




Re: crash when using local display but not remote

2020-01-24 Thread cobjective
Hi, Riccardo!

> On Jan 24, 2020, at 00:51, Riccardo Mottola  
> wrote:
> 
> Hi!
> 
> To test the latest "libart" stuff, I upgraded all of GNUstep on the old and 
> venerable MIPS Book Letux400.
> 
> I got everything to build! yeah!
> 
> If I export the display through ssh, everything works and (albeit slow... the 
> Letux had the LAN connected through USB on the board) I get a fine looking 
> GNUstep!
> 
> If I use it locally, however, I get an immediate crash:
> 
> (gdb) r
> Starting program: /home/usr-GNUstep/Local/Tools/Ink
> [Thread debugging using libthread_db enabled]
> [New Thread 16384 (LWP 330)]
> Xlib:  extension "SYNC" missing on display ":0.0".
> *** glibc detected *** double free or corruption (out): 0x00636300 ***
> 
> Program received signal SIGABRT, Aborted.
> [Switching to Thread 16384 (LWP 330)]
> 0x2b898a94 in kill () from /lib/libc.so.6
> (gdb) bt
> #0  0x2b898a94 in kill () from /lib/libc.so.6
> #1  0x2b674b88 in pthread_kill () from /lib/libpthread.so.0
> #2  0x2b674c00 in raise () from /lib/libpthread.so.0
> #3  0x2b89a190 in abort () from /lib/libc.so.6
> #4  0x2b8d6294 in __fsetlocking () from /lib/libc.so.6
> #5  0x2b8d6294 in __fsetlocking () from /lib/libc.so.6
> Previous frame identical to this frame (corrupt stack?)
> 
> This does not look very useful; it looks like very early memory corruption?
> 
> I recompiled everything with debug=yes; I hope that still disables 
> optimizations correctly. I do not get a better stack, and not a better outcome 
> either!
> 
> Could someone using art on x86 or amd64 try valgrind?
> 


Give more information, please.
Why do you think it’s the art backend? Did you try other backends/compilers/linkers?
What are your build options (runtime, library combo, compiler, linker)?

P.S.: Does it mean you can build the art backend without ftfont-old.m (with my 
changes to art)?

Sergii


crash when using local display but not remote

2020-01-23 Thread Riccardo Mottola

Hi!

To test the latest "libart" stuff, I upgraded all of GNUstep on the old and 
venerable MIPS Book Letux400.


I got everything to build! yeah!

If I export the display through ssh, everything works and (albeit 
slow... the Letux had the LAN connected through USB on the board) I get 
a fine looking GNUstep!


If I use it locally, however, I get an immediate crash:

(gdb) r
Starting program: /home/usr-GNUstep/Local/Tools/Ink
[Thread debugging using libthread_db enabled]
[New Thread 16384 (LWP 330)]
Xlib:  extension "SYNC" missing on display ":0.0".
*** glibc detected *** double free or corruption (out): 0x00636300 ***

Program received signal SIGABRT, Aborted.
[Switching to Thread 16384 (LWP 330)]
0x2b898a94 in kill () from /lib/libc.so.6
(gdb) bt
#0  0x2b898a94 in kill () from /lib/libc.so.6
#1  0x2b674b88 in pthread_kill () from /lib/libpthread.so.0
#2  0x2b674c00 in raise () from /lib/libpthread.so.0
#3  0x2b89a190 in abort () from /lib/libc.so.6
#4  0x2b8d6294 in __fsetlocking () from /lib/libc.so.6
#5  0x2b8d6294 in __fsetlocking () from /lib/libc.so.6
Previous frame identical to this frame (corrupt stack?)

This does not look very useful; it looks like very early memory corruption?

I recompiled everything with debug=yes; I hope that still disables 
optimizations correctly. I do not get a better stack, and not a better 
outcome either!


Could someone using art on x86 or amd64 try valgrind?