Re: [s3ql] S3QL crashing when backing up to cloud using WiFi (Google) but not when backing up locally or through cable

2020-12-29 Thread Jules F.

>
> I do not know why you do not get any exception backtrace information, but 
> S3QL did raise an exception (and the last-resort exception hook logs that). 
> It might be that your trio version is too old or too new for S3QL. (I am 
> using 0.15.0, and that version works well with S3QL.)
>

The installation says it's using trio 0.17.0 (and I verified that this is 
what it's actually using). 



Re: [s3ql] S3QL crashes when uploading large files with inconclusive logs (yet again)

2020-12-29 Thread Ivan Shapovalov
On 2020-12-29 at 09:39 +, Nikolaus Rath wrote:
> On Dec 28 2020, Ivan Shapovalov  wrote:
> > On 2020-12-28 at 13:41 +, Nikolaus Rath wrote:
> > > 
> > 
> > Indeed it does. I have added some proper exception logging and
> > found the actual problem, which is, unsurprisingly, a combination
> > of user error, unclear system requirements, and broken logging.
> 
> Great, thanks for your help!
> 
> Could you tell what exactly you changed to make the exception
> information appear?

Sure, I'll send a PR fixing exception logging shortly.
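
Not the actual PR, just a minimal sketch of the idea (with a made-up
function name): the "NoneType: None" line in the debug log from my
earlier message is what the logging module prints when a traceback is
requested but no exception is being handled any more, so the fix is to
hand the caught exception object to the logger explicitly.

-- 8< --
import logging

log = logging.getLogger('s3ql.block_cache')

def report_upload_failure(obj_id, exc):
    # exc_info accepts an exception instance (Python 3), which attaches
    # the real traceback even when we are no longer inside the except
    # block. A bare log.exception(...) here would print "NoneType: None"
    # because sys.exc_info() is already empty at this point.
    log.error('upload of %s failed', obj_id, exc_info=exc)
-- 8< --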

> 
> 
> 
> I consider this a bug in the B2 backend (and other backends may have
> the same problem). If the backend raises an exception from write(),
> this should not result in a second exception from close(). Either
> write() should update the checksum to reflect the partial data that
> was written (thus eliminating the checksum error on upload), or
> perhaps it should set a flag that this object should not be uploaded
> at all on close.
> 
> https://github.com/s3ql/s3ql/issues/228
> 

As far as I remember from my brief digging, all backends are
implemented similarly in this regard. Any exception in ObjectW.write()
will bubble up through do_write() in BlockCache._do_upload() to
AbstractBackend.perform_write(), which holds the ObjectW as a context
manager and exits it, thus calling ObjectW.close().
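
A stripped-down illustration of that flow (the names mirror the S3QL
sources, but this is a toy, not the real code):

-- 8< --
class ObjectW:
    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()   # runs even while write()'s exception is propagating
        return False   # don't swallow the original exception

    def write(self, buf):
        raise OSError(28, 'No space left on device')  # e.g. ENOSPC on tmpfs

    def close(self):
        # In the real backend this uploads the (now partial) temporary
        # file, which is where the checksum-mismatch error comes from.
        raise RuntimeError('second exception from close()')

def perform_write(fn, key):
    with ObjectW() as fh:   # the context manager mentioned above
        return fn(fh)

perform_write(lambda fh: fh.write(b'data'), 's3ql_data_1')
# The RuntimeError is raised "During handling of" the OSError, producing
# exactly the kind of chained traceback seen in the logs.
-- 8< --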

> > 
> > 
> > I guess this was causing the 400 as well, because we were sending
> > the
> > temporary file including the last partial write, but the locally
> > computed hash did not include it.
> 
> Exactly.
> 
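(A toy illustration of that mismatch, with hashlib standing in for the
backend's upload checksum:)

-- 8< --
import hashlib

# The temporary file contains every byte that reached it, including the
# partial last write ...
uploaded_body = b'complete write' + b'partial wr'

# ... but the running hash was only updated for writes that succeeded.
local_hash = hashlib.sha1(b'complete write').hexdigest()

assert hashlib.sha1(uploaded_body).hexdigest() != local_hash
# -> "400 bad_request - Checksum did not match data received"
-- 8< --
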
> > There is still a problem of getting the dugong.StateError instead
> > of a
> > 400. I don't know why it started giving me this, even after rolling
> > back all the tentative patches. Though I'm sure it will be
> > something
> > trivial.
> 
> My guess is that dugong is expecting more data in the request (thus
> the request is not considered complete and there is no response that
> could be read). However, looking at the code, I do not immediately
> see
> why this would be the case.

Yeah, me neither.

> Best,
> -Nikolaus

Thanks,
-- 
Ivan Shapovalov / intelfx /



Re: [s3ql] S3QL crashes when uploading large files with inconclusive logs (yet again)

2020-12-29 Thread Nikolaus Rath
On Dec 28 2020, Ivan Shapovalov  wrote:
> On 2020-12-28 at 13:41 +, Nikolaus Rath wrote:
>> On Dec 28 2020, Ivan Shapovalov  wrote:
>> > 2020-12-27 19:04:33.819 211867 DEBUG    Thread-1   s3ql.backends.b2.b2_backend._do_request: RESPONSE: POST 400  97
>> > 2020-12-27 19:04:33.820 211867 DEBUG    MainThread s3ql.block_cache.with_event_loop: upload of 8652 failed
>> > NoneType: None
>> > 2020-12-27 19:04:33.827 211867 DEBUG    Thread-1   s3ql.mount.exchook: recording exception 400: bad_request - Checksum did not match data received
>> > zsh: terminated  mount.s3ql b2:// /mnt/b2/files -o
>> > -- 8< --
>> > 
>> > Leaving out the question of why journald eats the last line, the
>> > situation is pretty clear. The backend (B2Backend._do_request)
>> > raises
>> > an exception (B2Error) which is not considered a "temporary
>> > failure".
>> > 
>> > I have just patched up error handling in the B2 backend to consider
>> > the
>> > checksum mismatch a transient failure (testing now).
>> 
>> Is B2 not using SSL for its data connection? That should ensure that
>> there are no checksum errors.
>
> Indeed it does. I have added some proper exception logging and found
> the actual problem, which is, unsurprisingly, a combination of user
> error, unclear system requirements, and broken logging.

Great, thanks for your help!

Could you tell what exactly you changed to make the exception
information appear?


>
> The B2 backend creates a temporary file for each object that is being
> uploaded. My s3ql instance has object size = 1 GiB, and with threads=8,
> that means up to 8 GiB worth of temporary files at once. The thing is,
> temporary files are created in /tmp, which is a tmpfs and has a size
> limit.
>
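If the backend creates those files through Python's tempfile module
(which I have not verified, but the traceback below suggests), the
directory can be redirected with the TMPDIR environment variable,
e.g. running mount.s3ql with TMPDIR=/var/tmp. A quick demonstration:

-- 8< --
import os
import tempfile

print(tempfile.gettempdir())       # typically /tmp, a size-limited tmpfs

os.environ['TMPDIR'] = '/var/tmp'  # assumed to be a disk-backed directory
tempfile.tempdir = None            # drop the cached value; re-read TMPDIR
print(tempfile.gettempdir())       # now /var/tmp
-- 8< --
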
> -- 8< --
> 2020-12-28 16:39:12.924 340652 ERROR    Thread-3 s3ql.mount.exchook: Unhandled exception in thread, terminating
> Traceback (most recent call last):
>   File "/usr/lib/python3.9/site-packages/s3ql/backends/common.py", line 279, in perform_write
>     return fn(fh)
>   File "/usr/lib/python3.9/site-packages/s3ql/block_cache.py", line 334, in do_write
>     fh.write(buf)
>   File "/usr/lib/python3.9/site-packages/s3ql/backends/b2/object_w.py", line 36, in write
>     self.fh.write(buf)
> OSError: [Errno 28] No space left on device
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/usr/lib/python3.9/site-packages/s3ql/mount.py", line 58, in run_with_except_hook
>     run_old(*args, **kw)
>   File "/usr/lib/python3.9/threading.py", line 892, in run
>     self._target(*self._args, **self._kwargs)
>   File "/usr/lib/python3.9/site-packages/s3ql/block_cache.py", line 319, in _upload_loop
>     self._do_upload(*tmp)
>   File "/usr/lib/python3.9/site-packages/s3ql/block_cache.py", line 376, in _do_upload
>     obj_size = backend.perform_write(do_write, 's3ql_data_%d'
>   File "/usr/lib/python3.9/site-packages/s3ql/backends/common.py", line 108, in wrapped
>     return method(*a, **kw)
>   File "/usr/lib/python3.9/site-packages/s3ql/backends/common.py", line 279, in perform_write
>     return fn(fh)
>   File "/usr/lib/python3.9/site-packages/s3ql/backends/b2/object_w.py", line 79, in __exit__
>     self.close()


I consider this a bug in the B2 backend (and other backends may have the
same problem). If the backend raises an exception from write(), this
should not result in a second exception from close(). Either write()
should update the checksum to reflect the partial data that was written
(thus eliminating the checksum error on upload), or perhaps it should
set a flag that this object should not be uploaded at all on close.

https://github.com/s3ql/s3ql/issues/228
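
A rough sketch of the second option (hypothetical names, not the actual
S3QL code; see the issue for discussion):

-- 8< --
import hashlib

class ObjectW:
    def __init__(self, fh):
        self.fh = fh
        self.sha1 = hashlib.sha1()
        self.write_failed = False

    def write(self, buf):
        try:
            self.fh.write(buf)
        except BaseException:
            # Poison the object so close() skips the upload and only the
            # original exception propagates. (The first option would
            # instead hash exactly the bytes that reached self.fh, so the
            # checksum matches the partial file.)
            self.write_failed = True
            raise
        self.sha1.update(buf)

    def close(self):
        if self.write_failed:
            return
        # ... normal path: upload self.fh with self.sha1.hexdigest() ...

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()
        return False
-- 8< --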


>   File "/usr/lib/python3.9/site-packages/s3ql/backends/common.py", line 108, in wrapped
>     return method(*a, **kw)
>   File "/usr/lib/python3.9/site-packages/s3ql/backends/b2/object_w.py", line 64, in close
>     response = self.backend._do_upload_request(self.headers, self.fh)
>   File "/usr/lib/python3.9/site-packages/s3ql/backends/b2/b2_backend.py", line 291, in _do_upload_request
>     response, response_body = self._do_request(upload_url_info['connection'], 'POST', upload_url_info['path'], headers, body)
>   File "/usr/lib/python3.9/site-packages/s3ql/backends/b2/b2_backend.py", line 235, in _do_request
>     response =