Hi JPR,

Thanks for your comments.

- On each crash report, it is a different thread. Twice it was similar
to what I'll post at the end of this message (ServerNet select I/O
handler).
Once it was the LabProjects List, but there is nothing unique about
that list of records.
 - Range checking is on (the application always runs compiled)
 - It could be related to network stress - it typically happens at
busier hours (never after hours)
 - I do generate debug files. It's not a specific method that is
running. It varies. The last command in the debug file always has a
"." after it. My understanding is that this means the command executed
completely.
 - I do use interprocess variables to cache employee data for fast
access to names, email addresses, etc. It is relatively small with 7
parallel arrays containing fewer than 150 elements each. Also some
system settings - also under 100 elements.
 - The cache is set to 1GB. The datafile is 3GB in size.
- I use the 4D Info Reporter. Tim has walked me through looking at the
results. At first it looked like the Server was running low on memory
after a backup, but I wrote in a purge command that clears it up. At
the time of each crash there is nothing remarkable in the report.

I think you are correct, that it probably is not a client issue -
though I do use routines that have the "execute on server" box
checked. Either way, I uploaded a modification last night that turns
on client debugging and creates a session record in a table at the
start of a client session. If the client record is not closed out via
the "On Exit" method, when the user logs in again the system will
upload their debug files (max of two are created). On the next crash
I'll take a closer look to see what clients were doing.
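
For anyone following along, the session check described above can be
sketched roughly as below. This is only an illustration in Python, not
the actual 4D code. The names ClientSession, SessionLog, on_login and
on_exit are hypothetical, and the sketch assumes one record per client
session with a "closed" stamp that only the "On Exit" method sets.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class ClientSession:
    # Hypothetical session record, one per client login.
    user: str
    started: datetime
    closed: Optional[datetime] = None  # set only by a clean "On Exit"
    abnormal: bool = False             # True if the session never closed cleanly


class SessionLog:
    """In-memory stand-in for the session table (assumption for illustration)."""

    def __init__(self) -> None:
        self.sessions: List[ClientSession] = []

    def on_login(self, user: str, now: datetime) -> bool:
        """Open a new session; return True if debug files should be uploaded.

        An upload is wanted when this user's earlier session was never
        closed, i.e. the client went away without running "On Exit".
        """
        needs_upload = False
        for session in self.sessions:
            if session.user == user and session.closed is None:
                session.abnormal = True  # flag it so it is reported once
                session.closed = now
                needs_upload = True
        self.sessions.append(ClientSession(user=user, started=now))
        return needs_upload

    def on_exit(self, user: str, now: datetime) -> None:
        """Close the user's open session on a normal quit."""
        for session in self.sessions:
            if session.user == user and session.closed is None:
                session.closed = now
```

After a server crash, every client that was connected would return True
from on_login the next time it connects, so the set of uploaded debug
files should match the set of users who were on at the time.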

One thing that bothers me is that on occasion the Administration
interface stops displaying information. For example, when I went to
quit the application last night for an update, the window appeared
asking how to quit. I told the system to shut down in 1 minute. The
next dialog contained only a server icon and the countdown clock
stuck at "00 00". No text or message was displayed. The server did
shut down as requested in 1 minute.

Thanks for your questions. Another sample crash report is below.

dave


Thread 29 Crashed:: ServerNet select I/O handler (id = 90423)
0   com.4d.ServerNet                  0x0000000110d5837e
xbox::VTCPSelectWatchAction::HandleError(fd_set*) + 38
1   com.4d.ServerNet                  0x0000000110d589ea
xbox::VTCPSelectIOHandler::DoRun() + 712
2   com.4d.ServerNet                  0x0000000110d58afd non-virtual
thunk to xbox::VTCPSelectIOHandler::DoRun() + 13
3   com.4d.kernel                     0x0000000110bbadaa
xbox::VTask::_Run() + 234
4   com.4d.kernel                     0x0000000110bbfb01
xbox::XMacTask_preemptive::_ThreadProc(void*) + 145
5   libsystem_pthread.dylib           0x00007fff6e307661 _pthread_body + 340
6   libsystem_pthread.dylib           0x00007fff6e30750d _pthread_start + 377
7   libsystem_pthread.dylib           0x00007fff6e306bf9 thread_start + 13
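
Since the crashed thread differs from report to report, it may help to
pull the thread name and top frames out of each report and compare them
side by side. The following is a minimal Python sketch of that idea, not
an official tool. It assumes the layout shown in the report above,
including frames that the mail client wrapped onto a second line. The
function name parse_crashed_thread and the dictionary keys are made up
for illustration.

```python
import re

# "Thread 29 Crashed:: ServerNet select I/O handler (id = 90423)"
HEADER = re.compile(r"^Thread (\d+) Crashed::\s*(.*?)\s*\(id = (-?\d+)\)")
# "<frame index>  <image name>  <hex address>  [symbol, possibly wrapped]"
FRAME = re.compile(r"^(\d+)\s+(.+?)\s+(0x[0-9a-fA-F]+)\s*(.*)$")
# "xbox::VTask::_Run() + 234" -> function name and byte offset
SYMBOL = re.compile(r"^(.*?) \+ (\d+)$")


def parse_crashed_thread(text):
    """Return (thread_info, frames) for the crashed thread in a report excerpt."""
    thread = None
    frames = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            if frames:
                break  # a blank line after the frames ends the backtrace
            continue
        header = HEADER.match(line)
        if header:
            thread = {
                "number": int(header.group(1)),
                "name": header.group(2),
                "id": int(header.group(3)),
            }
            continue
        if thread is None:
            continue  # skip everything before the crashed-thread header
        frame = FRAME.match(line)
        if frame:
            frames.append({
                "index": int(frame.group(1)),
                "image": frame.group(2),
                "address": frame.group(3),
                "symbol": frame.group(4),
            })
        elif frames:
            # Wrapped line: it continues the symbol of the previous frame.
            frames[-1]["symbol"] = (frames[-1]["symbol"] + " " + line).strip()
    for frame in frames:
        symbol = SYMBOL.match(frame["symbol"])
        if symbol:
            frame["function"] = symbol.group(1)
            frame["offset"] = int(symbol.group(2))
        else:
            frame["function"] = frame["symbol"]
            frame["offset"] = None
    return thread, frames
```

Frame 0 is where the thread died and the last frame is where the call
chain started, so comparing frame 0 and the thread name across several
reports is usually the quickest way to see whether the crashes share a
common path.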

On Tue, Sep 4, 2018 at 10:40 AM JPR via 4D_Tech <4d_tech@lists.4d.com> wrote:
>
> [JPR]
>
> Hi Dave, Tim,
>
> This kind of crash is always difficult to track down, for it is not easily
> reproducible. From what I see (and as Tim pointed out) it seems there is a
> memory problem that is revealed in the process LabProjects List. But a
> memory problem can occur a while before the actual crash, because the 
> application may have a corrupted memory and not be aware of it until the 
> crash.
>
> - Is your application compiled? If yes, be sure that the Range checking 
> option is set.
> - Is the LabProjects List process a client process on server, or a worker or
> process running on the server?
> - The time of crash seems irrelevant, but maybe it's linked to a peak in
> activity and server or network stress?
> - A client problem causing a server crash is unlikely, but it may help to 
> know if there is a correlation between the crash and a particular client 
> doing a particular operation.
> - Do you know which method is executed when it crashes?
> - Do you use interprocess variables like arrays for instance?
> - How much memory has been given to the server and to the cache?
>
> This is just a short list of points to check, but it may help to reduce the 
> problem to a small part of the application.
>
> My very best,
>
> JPR
>
>
> > On 2 Sep 2018, at 21:00, 4d_tech-requ...@lists.4d.com wrote:
> >
> > From: Tim Nevels <timnev...@mac.com>
> > To: 4d_tech@lists.4d.com
> > Subject: Re: Isolating the Cause of a Server Crash
> > Message-ID: <be3bf13d-9f79-4715-aadf-240c4c189...@mac.com>
> > Content-Type: text/plain; charset=utf-8
> >
> > On Sep 1, 2018, at 2:00 PM, Dave Nasralla wrote:
> >
> >> One of our systems is crashing about every 3 days and I can't seem to
> >> isolate the cause. Lately these are crashes with a Mac crash report
> >> appearing on the screen.
> >> Some system details are:
> > >> - 4D Built Server app with v17.0 HF1 (64 bit Server with 64 bit Mac and
> > >> 32 bit Windows Clients)
> >> - Mac and Windows Clients
> >> - Mac OS 10.13.5
> >>
> >> What I know so far:
> >> - I have the Server Debug file. It ends with a "." and so the last
> >> command appears to have executed.
> >> - I'm using the Report Info component, logging every 5 minutes. There
> >> doesn't seem to be memory problems or run away cache issues.
> > >> - I also know who was on each time it crashed and sent out an email
> > >> to those users to find patterns (so far I've found none).
> >> - The crashes typically happen around 10am to 11am.
> >> - The client and server builds match.
> >>
> >> I'm debating turning on the client debugger files and then harvesting
> >> them afterwards when the user logs back in. I'm open to other
> >> debugging techniques.
> >>
> >> There are other v17 systems running on the same machine with zero issue.
> >>
> >> Below is a snippet of the crash report. It seems to be different each
> >> time, but here is the latest. Thread 73 crashed, so I only included
> >> that one.
> >>
> >> Thanks,
> >>
> >> dave nasralla
> >> ------------------------------------
> >> Process:               Corporate [93958]
> >> Path:                  /Users/USER/*/Corporate
> >> Server.app/Contents/MacOS/Corporate
> >> Identifier:            4d.com.Corporate Server.app
> >> Version:               17.0 build 17.226566 (???)
> >> Code Type:             X86-64 (Native)
> >> Parent Process:        ??? [1]
> >> Responsible:           Corporate [93958]
> >> User ID:               501
> >>
> >> Date/Time:             2018-08-31 11:00:05.952 -0500
> >> OS Version:            Mac OS X 10.13.5 (17F77)
> >> Report Version:        12
> >> Anonymous UUID:        723511FD-4CA0-6E8B-0642-883209248DFC
> >>
> >>
> >> Time Awake Since Boot: 3700000 seconds
> >>
> >> System Integrity Protection: enabled
> >>
> >> Crashed Thread:        73  LabProjects List (id = -114)
> >>
> >> Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
> >> Exception Codes:       EXC_I386_GPFLT
> >> Exception Note:        EXC_CORPSE_NOTIFY
> >>
> >> Termination Signal:    Segmentation fault: 11
> >> Termination Reason:    Namespace SIGNAL, Code 0xb
> >> Terminating Process:   exc handler [0]
> >> ----------------------------------------------------------
> >>
> >>
> >> Thread 73 Crashed:: LabProjects List (id = -114)
> >> 0   4d.com.Corporate Server.app       0x000000010694fdbe
> >> V4DConnection::OnPostpone(bool) + 40
> >> 1   4d.com.Corporate Server.app       0x0000000106b095f7
> >> V4DServerUser::PostponeServiceConnection() + 35
> >> 2   4d.com.Corporate Server.app       0x0000000106b20567
> >> V4DServer::exec_ConnectionPostpone(V4DRequestReply&, V4DTaskConcrete*,
> >> short) + 395
> >> 3   4d.com.Corporate Server.app       0x0000000106b211ca
> >> V4DServer::exec_streamreq(V4DRequestReply&, V4DTaskConcrete*) + 100
> >
> > Hi Dave,
> >
> > Crashing every 3 days is a real problem and totally unacceptable. So what 
> > can be done to try and make this situation better? We need to make changes 
> > to make this crashing stop. But what changes?
> >
> > Here is my thinking as I read this crash report. Keep in mind I’m not an 
> > expert on this, so I may be wrong in some areas. If I am wrong hopefully 
> > those that know more can correct me — and in turn help me and others 
> > understand more about how to read these macOS crash reports. (Thinking 
> > about Miyako, JPR, Christian Sakowski and Rob Laveaux — they are real 
> > experts in this area. Real macOS programmers that know how to read these 
> > things properly.)
> >
> > The crash report is supposed to provide a programmer with information on
> > exactly where the program crashed and the cause of the crash. If you have
> > the special 4D “debug” version it will contain more “symbols” and thus when 
> > 4D crashes you get better names for functions instead of just memory 
> > address offset. I think you even get 4D command names that were involved in 
> > the crash. But the basic crash dump info that we have here can help point 
> > to the general area of concern. Here is a website that helps explain crash 
> > dumps and how to read them:
> >
> > https://www.maketecheasier.com/read-macos-crash-reports-troubleshoot-mac/
> >
> > This is 4D v17.0 build 226566 that is running compiled in 64bit mode (Code 
> > Type: x86-64). So first thought is that this could be a 4D 64bit issue. 
> > That’s important because some of the code is completely different between 
> > 32bit 4D and 64bit 4D. The 64bit code could be newly written code, the 
> > 32bit code could be legacy code that has been around for years.
> >
> > Thread 73 “LabProjects List” is what crashed. Do you have a table named 
> > “LabProjects” or maybe a MODIFY SELECTION or a listbox window that shows 
> > records in this table? Or a process that has that name? Makes me think that 
> > you do. That’s another pointer to where in your application the crashing 
> > problem occurred.
> >
> > Exception Type is "EXC_BAD_ACCESS (SIGSEGV)” and that means "the program 
> > attempts to access memory incorrectly or with an invalid address”. Could be 
> > a C pointer that went bad or something to do with virtual memory or even
> > how 4D allocates its own memory internally. Could be 4D data cache related.
> > Basically 4D tried to access memory it was not allowed to access and macOS
> > killed 4D so that it could not damage other parts of the system and cause 
> > them to crash. Thank you macOS for watching out and protecting us from 
> > complete system corruption and crashing. Windows does this too.
> >
> > The last area is where we can see exactly where in 4D — and even the 4D C 
> > or Objective C function name — that was running when macOS said “enough, 
> > this application has gone crazy, I need to kill it before it does damage to 
> > other applications.” The functions are listed in reverse chronological 
> > order, so the one at the bottom is where the “call chain” started. The one 
> > at the top is where it died.
> >
> > The function name is “V4DConnection::OnPostpone(bool)” and the code 40
> > bytes from the start of that function is where the offending memory
> > access occurred. The name “V4DConnection” makes me think this is
> > related to networking, 4D Server handling network actions with 4D Client.
> > The “OnPostpone” makes me think this is somehow related to sleeping or a 4D
> > Client connection that has been asleep and needs to now wake up. And lastly
> > it makes me think “this is related to the new network layer code”. Again,
> > this is just my thinking. I could be completely wrong about all of this.
> >
> > So now my brain tries to build a scenario that could most likely happen 
> > that could be connected to this situation. Happens during the day between 
> > 10am and 11am. It’s a work day with users connected. People came in to work,
> > got connected to 4D Server, then wandered off to a meeting or something and
> > their computer went to sleep. You are using 4D Server compiled 64bit so you 
> > MUST be using the new network layer. Legacy is only available in 32bit 
> > compiled 4D Server macOS.
> >
> > There is this new network layer feature where if a 4D Client machine goes 
> > into sleep mode you don’t lose your 4D Server connection. So that when the 
> > user wakes up the 4D Client machine it notifies 4D Server and the old 
> > network connection is reenergized and brought back to life. That
> > “OnPostpone” mentioned above makes me think this also. Maybe something went
> > wrong in that area of 4D. It is a tricky area because sleep could last for
> > hours or days and memory could be moved around and pointers can easily go
> > bad in those types of situations.
> >
> > So there is my analysis. Now what changes could you make to stop these damn 
> > crashing situations? Here are some ideas:
> >
> > - You say it happens about every 3 days, so just restart 4D Server every 
> > single day. Giant PITA I know. But just an idea for what to do now to 
> > eliminate the crashing.
> >
> > - Stop all 4D Client machines from sleeping. You’d have to physically go to 
> > every machine and turn off system sleeping and allow the display to go to 
> > sleep. You can’t rely on users to do this, and do it right. This is what I 
> > would do if I had physical access to all the machines, or at least RDP
> > access, so that I could make sure every machine had system sleep turned
> > off. (Of course you already have App Napping turned off on the 4D Server 
> > machine so that’s not part of this issue, right?)
> >
> > - Crash dump lists Build Number 226566. v17.0 has build 225365. v17.0 HF1 
> > has build 226237. A quick check of 4D forums “Nightly Builds 4D v17” shows 
> > this build is from 8/22/18. So you are running a nightly build. I’m 
> > guessing you used v17.0 and had problems, went to v17.0 HF1 and still had 
> > problems, so you went to nightly builds to try and find a fix. Maybe you 
> > keep doing that. Current nightly build is 226837. You may find they’ve 
> > fixed the bug that is biting you.
> >
> > - Stop using the new network layer. You would have to stop using 64bit 4D
> > Server so that may not be a viable option. You would be limited to a 2GB data
> > cache. But maybe if you can stop the crashing now it is worth that limitation.
> > That means compiling a 32bit version of 4D Server and 4D Client, and 
> > replacing all the 64bit 4D Client applications with the 32bit version. I 
> > think you could use the auto client update feature to automate this.
>
> **********************************************************************
> 4D Internet Users Group (4D iNUG)
> Archive:  http://lists.4d.com/archives.html
> Options: https://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **********************************************************************



-- 
David Nasralla
Clean Air Engineering