Can I ask why you chose not to use pthreads to start with? I'd like to understand better why folks would choose wasm workers over pthreads.
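For comparison, here is roughly what the per-object fan-out described below might look like with pthreads. This is only a sketch under assumptions: `Job`, `process_object`, `run_jobs`, `NUM_OBJECTS` and `SCRATCH_BYTES` are made-up names for illustration, not anything from your code, and the "work" is just a placeholder checksum over a scratch buffer that each thread allocates and frees itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NUM_OBJECTS 4            /* hypothetical number of 3D objects */
#define SCRATCH_BYTES (1u << 20) /* hypothetical per-object scratch size */

/* Hypothetical per-object job: index in, checksum and status out. */
typedef struct {
    int      index;
    uint32_t checksum;
    int      failed;
} Job;

/* Thread entry point: allocates scratch memory, fills it, sums it and
 * frees it again, standing in for "allocate and dispose" work done
 * off the main thread. */
static void *process_object(void *arg) {
    Job *job = (Job *)arg;
    unsigned char *scratch = malloc(SCRATCH_BYTES);
    if (!scratch) {
        job->failed = 1; /* report the failure instead of dereferencing NULL */
        return NULL;
    }
    memset(scratch, job->index + 1, SCRATCH_BYTES);
    uint32_t sum = 0;
    for (size_t i = 0; i < SCRATCH_BYTES; i++)
        sum += scratch[i];
    job->checksum = sum;
    free(scratch);
    return NULL;
}

/* Starts one thread per job and joins all threads that were started.
 * Returns 0 on success, -1 if count is too large, a thread could not
 * be created, or any job reported an allocation failure. */
static int run_jobs(Job *jobs, int count) {
    pthread_t threads[NUM_OBJECTS];
    int started = 0;
    int rc = 0;

    assert(jobs != NULL);
    if (count > NUM_OBJECTS)
        return -1;

    for (int i = 0; i < count; i++) {
        jobs[i].index = i;
        jobs[i].checksum = 0;
        jobs[i].failed = 0;
        if (pthread_create(&threads[i], NULL, process_object, &jobs[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        if (jobs[i].failed)
            rc = -1;
    }
    return rc;
}
```

With emcc this would be built with `-pthread`. Because threads in the browser are backed by Web Workers, you would normally also want a pre-created pool (`-sPTHREAD_POOL_SIZE=<n>`) or `-sPROXY_TO_PTHREAD`, so that a `pthread_join` on the main thread is not left waiting for workers that have not started yet.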

On Fri, May 26, 2023 at 3:25 AM 'Dieter Weidenbrück' via emscripten-discuss <emscripten-discuss@googlegroups.com> wrote:

> Hi Sam,
> IIRC, when I started with Emscripten a while ago the program would abort
> in case of a memory error. As my app is comparable to a desktop app, this
> was not acceptable, so I set ABORTING_MALLOC to 0. I understand that this
> flag has a different meaning today. Here is how all my allocation calls
> work:
>
> Error_T allocMemPtr(MemPtr_T *p,uint32_T size,boolean_T clear) {
>     _MemPtr_T mp;
>
>     if (clear)
>         mp = (_MemPtr_T)calloc(1,size + sizeof(_Mem_T));
>     else
>         mp = (_MemPtr_T)malloc(size + sizeof(_Mem_T));
>     if (mp) {
>         mp->size = size;
>         *p = (MemPtr_T)((char_T*)mp + sizeof(_Mem_T));
>         return kErr_NoErr;
>     }
>     return kErr_MemErr;
> }
> Error_T setMemPtrSize(MemPtr_T *p,uint32_T size){
>     _MemPtr_T m = _MP(*p);
>     MemPtr_T newPtr;
>
>     newPtr = realloc(m,size + sizeof(_Mem_T));
>     if (newPtr) {
>         m = (_MemPtr_T)newPtr;
>         m->size = size;
>         *p = (MemPtr_T)((char_T*)m + sizeof(_Mem_T));
>         return kErr_NoErr;
>     }
>     return kErr_MemErr;
> }
>
> So I should catch all errors. However, errors (i.e. return value == 0)
> are not reported by malloc or calloc during the problems I am experiencing.
> I added debug lines, but not a single failure was recorded.
> Removing ABORTING_MALLOC did not result in any change of error outcome.
>
> I see two different behaviors now:
> - setting up workers and checking that they run by
> static void startUpWorker(void) {
> #ifdef __EMSCRIPTEN__
>     int32_T w = emscripten_wasm_worker_self_id();
>     if (! emscripten_current_thread_is_wasm_worker()){
>         EM_ASM_({
>             console.log("Error: No worker: " + $0);
>         },w);
>     }
> #endif //__EMSCRIPTEN__
> }
> - then I do my stuff and receive about 10 of the "Uncaught RuntimeError:
> memory access out of bounds" errors.
> - no failures of malloc/calloc recognized
>
> The second behavior is
> - in main() I call this routine:
> static void memtest(void) {
> #define NUM_CHUNKS 15
>     const int CHUNK_SIZE = 100 * 1024 * 1024;
>     // zero-init: allocMemPtr leaves *p untouched on failure, and the
>     // cleanup loop below relies on NULLPTR to find the first unused slot
>     void* p[NUM_CHUNKS] = {0};
>     Error_T err = kErr_NoErr;
>
>     for (int i = 0; i < NUM_CHUNKS; i++) {
>         err = allocMemPtr(&p[i],CHUNK_SIZE,FALSE); //see function above
>         if (err != kErr_NoErr || p[i] == NULLPTR) {
>             printf("Error chunk %d\n",i);
>             break;
>         }
>     }
>     for (int i = 0; i < NUM_CHUNKS; i++) {
>         if (p[i] == NULLPTR)
>             break;
>         disposeMemPtr(p[i]);
>     }
> }
> - then I start up the workers as described above
> - then I do my stuff
> - sometimes this results in error free behavior, but not always. If an
> error occurs, I only get one "Uncaught RuntimeError" message.
>
> I am pretty confident that I handle memory allocation correctly, because
> my background is in development of desktop apps in C for 30+ years, and
> there you better not have any leaks and keep the app running whenever
> possible. So I must be doing something wrong when dealing with multiple
> threads.
> I will try out pthreads next, because I have no idea anymore what the
> cause could be here.
>
> Cheers,
> Dieter
>
> s...@google.com wrote on Thursday, May 25, 2023 at 23:20:33 UTC+2:
>
>> Is there some reason you added `-sABORTING_MALLOC=0`? That looks a
>> little suspicious, since it means the program can continue after malloc
>> fails, which means that any callsite that doesn't check the return value
>> of malloc can lead to segfaults. If you remove that setting does the
>> behaviour change?
>>
>> On Thu, May 25, 2023 at 1:27 PM 'Dieter Weidenbrück' via
>> emscripten-discuss <emscripte...@googlegroups.com> wrote:
>>
>>> Hi Sam,
>>>
>>> I can run the code in a single thread without problems, and I have done
>>> that for a while. So I assume that the code is stable.
>>>
>>> Here is the command line I use in a .bat file:
>>> emcc ./src/main.c ^
>>> ...
>>> ./src/w_com.c ^
>>> -I ./include/ ^
>>> -g3 ^
>>> --source-map-base ./ ^
>>> -gsource-map ^
>>> -s ALLOW_MEMORY_GROWTH=1 ^
>>> -s ENVIRONMENT=web,worker ^
>>> --shell-file ./index_template.html ^
>>> -s SUPPORT_ERRNO=0 ^
>>> -s MODULARIZE=1 ^
>>> -s ABORTING_MALLOC=0 ^
>>> -sWASM_WORKERS ^
>>> -s "EXPORT_NAME='wasmMod'" ^
>>> -s EXPORTED_FUNCTIONS="['_malloc','_free','_main']" ^
>>> -s EXPORTED_RUNTIME_METHODS="['cwrap','UTF16ToString','UTF8ToString','stringToUTF8','allocateUTF8']" ^
>>> -o index.html
>>>
>>> I will start familiarizing myself with pthreads to test whether that
>>> would work better.
>>>
>>> BTW, as an old C programmer I am fascinated by emscripten and its
>>> possibilities. Excellent job!
>>>
>>> Cheers,
>>> Dieter
>>>
>>> s...@google.com wrote on Thursday, May 25, 2023 at 20:29:58 UTC+2:
>>>
>>>> This looks like some kind of memory corruption, most likely due to the
>>>> use of multithreading/wasm_workers. Are you able to build a
>>>> single-threaded version of your program, or one that uses normal
>>>> pthreads rather than wasm workers?
>>>>
>>>> Also, can you share the full link command you are using?
>>>>
>>>> cheers,
>>>> sam
>>>>
>>>> On Thu, May 25, 2023 at 9:20 AM 'Dieter Weidenbrück' via
>>>> emscripten-discuss <emscripte...@googlegroups.com> wrote:
>>>>
>>>>> This is a memory snapshot when using SAFE_HEAP. So here I am quite
>>>>> below the browser limits, still the segfault occurs in different places.
>>>>> Ignore the first console line, it results from Norton Utilities I
>>>>> think.
>>>>>
>>>>> [image: error2.png]
>>>>>
>>>>> Dieter Weidenbrück wrote on Thursday, May 25, 2023 at 18:06:27 UTC+2:
>>>>>
>>>>>> Hi Sam,
>>>>>> I noticed already that I am bumping against browser limits,
>>>>>> especially with sanitizer switched on, so I reduced the pre-allocation
>>>>>> calls.
>>>>>> It turns out that asan uses so much memory that I can't use it to
>>>>>> analyze this case.
>>>>>>
>>>>>> I use
>>>>>> -s ALLOW_MEMORY_GROWTH=1
>>>>>> but don't specify any MAXIMUM_MEMORY.
>>>>>>
>>>>>> No pthreads version so far. I might try this next.
>>>>>>
>>>>>> Cheers,
>>>>>> Dieter
>>>>>>
>>>>>> s...@google.com wrote on Thursday, May 25, 2023 at 17:55:41 UTC+2:
>>>>>>
>>>>>>> Firstly, if you are allocating 1.8 GB you are likely pushing up
>>>>>>> against browser limits. Are you specifying a MAXIMUM_MEMORY larger
>>>>>>> than 2 GB?
>>>>>>>
>>>>>>> Secondly, it looks like you are using wasm workers, which are still
>>>>>>> relatively new. Do you have a version of your code that uses pthreads
>>>>>>> instead? It might tell us if the issue is related to wasm workers.
>>>>>>>
>>>>>>> cheers,
>>>>>>> sam
>>>>>>>
>>>>>>> On Thu, May 25, 2023 at 8:06 AM 'Dieter Weidenbrück' via
>>>>>>> emscripten-discuss <emscripte...@googlegroups.com> wrote:
>>>>>>>
>>>>>>>> The joy was premature, even with pre-allocated heap size segfaults
>>>>>>>> occur. :(
>>>>>>>>
>>>>>>>> Dieter Weidenbrück wrote on Thursday, May 25, 2023 at 16:28:37 UTC+2:
>>>>>>>>
>>>>>>>>> All,
>>>>>>>>> I am experiencing segmentation faults when using wasm workers.
>>>>>>>>> Overview:
>>>>>>>>> I am working on a project with considerable 3D data sets. The code
>>>>>>>>> has been stable for a while when running in the main thread alone.
>>>>>>>>> Then I started using js workers (no shared memory), and again all
>>>>>>>>> was well.
>>>>>>>>> Now I've switched to SharedArrayBuffers and wasm workers, and I
>>>>>>>>> keep running into random problems.
>>>>>>>>> I have prepared the code such that I can run with 0 workers up to
>>>>>>>>> navigator.hardwareConcurrency workers. All is well with 0 workers,
>>>>>>>>> but as soon as I use one or more workers, I keep getting segfaults
>>>>>>>>> because of invalid pointers, access out of bounds and similar.
>>>>>>>>>
>>>>>>>>> What happens in main thread and what in the wasm workers:
>>>>>>>>> I allocate all objects in the main thread when importing the 3D
>>>>>>>>> file. Then I fire off a function for each object that will do some
>>>>>>>>> serious calculations of the data, including allocating and
>>>>>>>>> disposing of memory. The workers allocate approx. 300 to 400 MB in
>>>>>>>>> addition to the main thread. All this happens in the same
>>>>>>>>> SharedArrayBuffer, of course.
>>>>>>>>>
>>>>>>>>> Here is what I've tried so far:
>>>>>>>>> - compiling with SAFE_HEAP=1
>>>>>>>>>   not a lot of helpful information,
>>>>>>>>> - compiling with -fsanitize=address
>>>>>>>>>   everything works without problems here!
>>>>>>>>> - compiling with ASSERTIONS=2
>>>>>>>>>   gave me this information:
>>>>>>>>>   [image: error.png]
>>>>>>>>>
>>>>>>>>> To me it looks like another resize call is executed while other
>>>>>>>>> workers keep working on the buffer, and then something gets into
>>>>>>>>> conflict.
>>>>>>>>> To test this, I allocated 1.8 GB right after startup in the main
>>>>>>>>> thread and disposed the mem blocks again just to trigger heap
>>>>>>>>> resize. After that everything works like a charm.
>>>>>>>>>
>>>>>>>>> Is there anything I am doing wrong?
>>>>>>>>> Sorry for not providing a sample, but there is a lot of code
>>>>>>>>> involved, and it is not easy to simulate this behavior. Happy to
>>>>>>>>> answer questions.
>>>>>>>>>
>>>>>>>>> All comments are appreciated.
>>>>>>>>> Thanks,
>>>>>>>>> Dieter
>>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "emscripten-discuss" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to emscripten-disc...@googlegroups.com.
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/emscripten-discuss/80d56314-59d8-4332-bb2e-ebe00fe52ea3n%40googlegroups.com.