So after scanning this thread and the ticket again, it is still unclear whether a completely universal solution exists.

While it would be nice if the storage API had a checksum(name) or md5(name) method, not all custom storage backends are going to support a single checksum standard. S3 doesn't explicitly support MD5 (apparently it unofficially does through ETags). Without a universal checksum, you can't use it to compare files across arbitrary backends.

I do agree that hacking the modified_time return value is a little ugly. The API is clearly documented as "returns a datetime...", so returning an MD5 checksum there is, well, hacky.

If you are passionate about moving this forward, here is what I'd suggest. Implement, document, and test .md5(name) as a standard method on storage backends. Like modified_time, this would raise NotImplementedError if not available. This could easily be its own ticket. MD5 is probably the closest you'll get to a checksum standard.

Once you have an md5 method defined for backends, you could support a --md5 option to collectstatic that would use that as the target/source comparison.

Another workaround is to just use collectstatic locally and rsync --checksum to your remote, if it supports rsync.

-Preston

On Sunday, October 7, 2012 8:59:16 PM UTC-7, Dan Loewenherz wrote:
> This issue just got me again tonight, so I'll try to push once more on
> this issue. It seems right now most people don't care that this is broken,
> which is a bummer, but in which case I'll just continue using my working
> solution.
>
> Dan
>
> On Sat, Oct 6, 2012 at 10:48 AM, Dan Loewenherz <d...@dlo.me> wrote:
>
>> Hey Jannis,
>>
>> On Mon, Oct 1, 2012 at 12:47 AM, Jannis Leidel <lei...@gmail.com> wrote:
>>
>>> On 30.09.2012, at 23:41, Dan Loewenherz <d...@dlo.me> wrote:
>>>
>>> > Many backends don't support last modified times, and even if they all
>>> > did, it's incorrect to assume that last modified time is an accurate
>>> > heuristic for whether a file has already been uploaded or not.
>>>
>>> Well but it's an accurate way to decide whether a file has been changed
>>> on the filesystem, and that's what collectstatic cares about. The storage
>>> backend *is* the API to extend that when needed, so feel free to use it.
>>
>> It's accurate *only* in certain situations. And on a distributed
>> development team, I've run into a lot of issues with developers re-uploading
>> files that have already been uploaded because they just recently updated
>> their repo.
>>
>> A checksum is the only truly accurate method to determine if a file has
>> changed.
>>
>> Additionally, you didn't address my point that I quoted from. Storage
>> backends don't just reflect filesystems--they could reflect files stored in
>> a database, S3, etc. And some of these filesystems don't support last
>> modified times.
>>
>>> > It might be a better idea to let the backends decide when a file has
>>> > been changed (instead of just calling the backend's last modified method).
>>>
>>> I don't understand, you can easily implement exactly that in the
>>> last_modified method if you'd like.
>>
>> This is a bit confusing...why call it last_modified when that doesn't
>> necessarily reflect what it's doing? It would be more flexible to create
>> two methods:
>>
>>     def modification_identifier(self):
>>
>>     def has_changed(self):
>>
>> Then, any backend could implement these however they might like, and
>> collectstatic would have no excuse for uploading the same file more than
>> once. Overloading last_modified to also do things like calculate md5's
>> seems a bit hacky to me, and confusing for any developer maintaining a
>> custom storage backend that doesn't support last modified.
>>
>> Dan
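
P.S. To make the .md5(name) suggestion concrete, here is a rough sketch of how it might look. This is only an illustration under stated assumptions: BaseStorage, InMemoryStorage, and needs_copy are made-up names for this example, md5() is not part of Django's storage API today, and the classes below are plain-Python stand-ins rather than real Django backends.

```python
import hashlib
import io


class BaseStorage:
    """Stand-in for a storage backend (hypothetical, not Django's Storage)."""

    def open(self, name):
        raise NotImplementedError

    def md5(self, name):
        # Proposed method: like modified_time, backends that cannot
        # provide a checksum simply raise NotImplementedError.
        raise NotImplementedError("This backend doesn't support md5().")


class InMemoryStorage(BaseStorage):
    """Toy backend that keeps file contents in a dict of bytes."""

    def __init__(self, files):
        self.files = dict(files)

    def open(self, name):
        # Raises KeyError if the file does not exist in this backend.
        return io.BytesIO(self.files[name])

    def md5(self, name):
        digest = hashlib.md5()
        with self.open(name) as f:
            # Read in chunks so large files are not loaded all at once.
            for chunk in iter(lambda: f.read(64 * 1024), b""):
                digest.update(chunk)
        return digest.hexdigest()


def needs_copy(source, target, name):
    """Return True if `name` should be copied from source to target."""
    try:
        return source.md5(name) != target.md5(name)
    except (NotImplementedError, KeyError):
        # No checksum available, or the file is missing: copy to be safe.
        return True
```

A --md5 option to collectstatic could then call something like needs_copy() in place of the modified-time comparison, falling back to copying whenever a backend raises NotImplementedError.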