Re: Compressed Pristines (Design Doc)

2012-03-26 Thread Ivan Zhakov
2012/3/25 Branko Čibej br...@e-reka.si:
 On 22.03.2012 17:01, Branko Čibej wrote:
 On 22.03.2012 16:50, Daniel Shahaf wrote:
 Branko Čibej wrote on Thu, Mar 22, 2012 at 16:37:24 +0100:
[..]

 Based on these observations, it's clear that the implementation should
 proceed as follows:

 Step 1: Just compress the pristine files, do not use any packing. This
 gives a 60% decrease in disk usage in the HTTPD case, but even if the
 decrease is only 30%, it's still worth the effort.

 Step 2: Store small (for some definition of small) compressed pristine
 files in a SQLite database. In the case of HTTPD, this gives up to an
 extra 90% savings in disk usage, but this is a very specific test case and
 it's hard to guess what kind of gain we'd get on average.

Makes sense to me. In that case we would also gain performance
(provided the SQLite blob API performs acceptably).

And IMHO small should be really small (up to 4k) to keep wc.db from
growing in size.


-- 
Ivan Zhakov


Re: Compressed Pristines (Design Doc)

2012-03-26 Thread Branko Čibej
On 26 March 2012 11:53, Ivan Zhakov i...@visualsvn.com wrote:
 2012/3/25 Branko Čibej br...@e-reka.si:
 On 22.03.2012 17:01, Branko Čibej wrote:
 On 22.03.2012 16:50, Daniel Shahaf wrote:
 Branko Čibej wrote on Thu, Mar 22, 2012 at 16:37:24 +0100:
 [..]

 Based on these observations, it's clear that the implementation should
 proceed as follows:

 Step 1: Just compress the pristine files, do not use any packing. This
 gives a 60% decrease in disk usage in the HTTPD case, but even if the
 decrease is only 30%, it's still worth the effort.

 Step 2: Store small (for some definition of small) compressed pristine
 files in a SQLite database. In the case of HTTPD, this gives up to an
 extra 90% savings in disk usage, but this is a very specific test case and
 it's hard to guess what kind of gain we'd get on average.

 Makes sense to me. In that case we would also gain performance
 (provided the SQLite blob API performs acceptably).

 And IMHO small should be really small (up to 4k) to keep wc.db from
 growing in size.

There's no requirement to put pristines in wc.db; they can just as
easily go into a different database that's attached to the same connection.
More to the point, in order to make using a database worthwhile, the
size limit shouldn't be /too/ low.
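A minimal sketch of the "different database on the same connection" idea, using SQLite's ATTACH DATABASE statement from Python. The file names, the schema name `pristines`, the table layout and the helper names are assumptions made for illustration; nothing here is taken from Subversion's actual code.

```python
import os
import sqlite3
import tempfile


def open_with_pristine_db(wc_db_path, pristine_db_path):
    """Open wc.db and attach a separate pristine database to the same connection."""
    conn = sqlite3.connect(wc_db_path)
    # ATTACH makes a second database file visible under its own schema name.
    conn.execute("ATTACH DATABASE ? AS pristines", (pristine_db_path,))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pristines.pristine ("
        " digest CHAR(40) PRIMARY KEY, contents BLOB)")
    conn.commit()
    return conn


def demo():
    """Store one blob in the attached database and read it back."""
    tmpdir = tempfile.mkdtemp()
    conn = open_with_pristine_db(os.path.join(tmpdir, "wc.db"),
                                 os.path.join(tmpdir, "pristine.db"))
    conn.execute("INSERT INTO pristines.pristine VALUES (?, ?)",
                 ("0" * 40, b"compressed bytes"))
    conn.commit()
    row = conn.execute("SELECT contents FROM pristines.pristine"
                       " WHERE digest = ?", ("0" * 40,)).fetchone()
    conn.close()
    return row[0]
```

With this layout the blobs live in their own file, so wc.db itself does not grow, while queries can still reference both databases through one connection.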

With a 4k filesystem block size, files up to 4k in size will waste 50% of
their allocated space on average; files that need two blocks will waste
25%; and so on. My test compared 8k and 32k limits, and just raising the
limit saved more than 50% additional space (on top of the already huge
savings from storing up-to-8k files in blobs) with no significant
difference in insertion times. (This last makes sense, as sqlite will
flush in multiples of the page size, so insertion times are really
proportional to the overall amount of data written. On average; YMMV.)
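The block-size arithmetic can be checked with a short sketch. It assumes file sizes are spread evenly over the given range, which is a simplification and not a measurement; the function names are made up for this example.

```python
def allocated_size(size, block_size=4096):
    """Bytes occupied on disk: the file size rounded up to whole blocks."""
    blocks = -(-size // block_size)  # ceiling division
    return blocks * block_size


def average_waste_fraction(min_size, max_size, block_size=4096):
    """Wasted share of allocated space, for sizes min_size..max_size equally likely."""
    wasted = 0
    allocated = 0
    for size in range(min_size, max_size + 1):
        alloc = allocated_size(size, block_size)
        wasted += alloc - size
        allocated += alloc
    return wasted / allocated


one_block = average_waste_fraction(1, 4096)      # close to 0.50
two_blocks = average_waste_fraction(4097, 8192)  # close to 0.25
```

Under that assumption the waste per file stays around half a block, so its share of the allocation shrinks as files get larger; small files are where packing into a database pays off most.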

-- Brane


Re: Compressed Pristines (Design Doc)

2012-03-26 Thread Daniel Shahaf
Hyrum K Wright wrote on Fri, Mar 23, 2012 at 13:54:25 -0500:
 As mentioned elsewhere, I too was surprised by the choice of a custom
 container, though I think you make a good argument for it.  One
 simplification I was thinking about is this: what if the container
 only needed to support add and batch-delete operations?  These are the
 current constraints of the existing pristine store; would they
 introduce additional simplicity into your design?
 
 In some respects, it looks like you're solving *two* problems:
 compression and the internal fragmentation due to large FS block
 sizes.  How orthogonal are the problems?  Could they be solved
 independently of each other in some way?  I know that compression
 exposes the internal fragmentation issue, but used alone it certainly
 doesn't make things *worse* does it?
 

Personally I've also been wondering, while reading the design doc, how
applicable the solutions are to libsvn_fs -- or if they could be
modularized in a way that lets libsvn_fs re-use parts of them, etc.

I haven't found much so far, but this is another angle to look at things
from.


Re: Compressed Pristines (Design Doc)

2012-03-26 Thread Daniel Shahaf
Daniel Shahaf wrote on Mon, Mar 26, 2012 at 14:30:34 +0200:
 I haven't found much so far

(Provided as an observation; not implying it's a problem.)


Re: Compressed Pristines (Design Doc)

2012-03-26 Thread Ashod Nakashian
- Original Message -

 From: Daniel Shahaf danie...@elego.de
 To: Hyrum K Wright hyrum.wri...@wandisco.com
 Cc: Ashod Nakashian ashodnakash...@yahoo.com; dev@subversion.apache.org 
 dev@subversion.apache.org; Philip Martin philip.mar...@wandisco.com; Greg 
 Stein gst...@gmail.com
 Sent: Monday, March 26, 2012 5:30 PM
 Subject: Re: Compressed Pristines (Design Doc)
 
 Hyrum K Wright wrote on Fri, Mar 23, 2012 at 13:54:25 -0500:
  As mentioned elsewhere, I too was surprised by the choice of a custom
  container, though I think you make a good argument for it.  One
  simplification I was thinking about is this: what if the container
  only needed to support add and batch-delete operations?  These are the
  current constraints of the existing pristine store; would they
  introduce additional simplicity into your design?
 
  In some respects, it looks like you're solving *two* problems:
  compression and the internal fragmentation due to large FS block
  sizes.  How orthogonal are the problems?  Could they be solved
  independently of each other in some way?  I know that compression
  exposes the internal fragmentation issue, but used alone it certainly
  doesn't make things *worse* does it?
 
 
 Personally I've also been wondering, while reading the design doc, how
 applicable the solutions are to libsvn_fs -- or if they could be
 modularized in a way that lets libsvn_fs re-use parts of them, etc.
 
 I haven't found much so far, but this is another angle to look at things
 from.
 
This is certainly something to plan for. I didn't include such info to avoid 
widening the scope and because we haven't agreed on the design yet. I'll 
probably get to that when we have consensus on the design, which will hopefully 
be soon.

-Ash



Re: Compressed Pristines (Design Doc)

2012-03-25 Thread Thomas Åkesson
Hi Ash,

I noticed that "Remove pristine store or render optional" is considered a
Non-Goal. If changes are made to wc-db in order to manage compressed pristines, 
it might make sense to ensure that the design can also handle optional 
pristines in the future.

The typical Subversion use case (code/text) will obviously benefit from 
compressed pristines. However, when storing binary files (e.g. graphics), which 
tend to be larger and less frequently modified files, optional pristines will 
likely be more beneficial. 

Thanks,
Thomas Å.  


On 22 mar 2012, at 08:15, Ashod Nakashian wrote:

 
 From: Daniel Shahaf danie...@elego.de
 To: Greg Stein gst...@gmail.com 
 Cc: Ashod Nakashian ashodnakash...@yahoo.com; dev@subversion.apache.org 
 Sent: Wednesday, March 21, 2012 2:08 PM
 Subject: Re: Compressed Pristines (Design Doc)
 
 Greg Stein wrote on Wed, Mar 21, 2012 at 16:51:47 -0400:
 On Wed, Mar 21, 2012 at 16:11, Daniel Shahaf d...@daniel.shahaf.name 
 wrote:
 Ashod Nakashian wrote on Wed, Mar 21, 2012 at 12:19:02 -0700:
 All,
 
 I'm happy to share[1] with you the design document for the Compressed 
 Pristines feature. The document is public and anyone can comment on any 
 part
 
 I can't.  Can you please move the document to our wiki, or dump it in an
 email to dev@, or on a pastebin, somewhere everyone can read it.
 
 I just opened it in an incognito window in Chrome. You should be able
 to access the thing.
 
 
 Tried, I get as far as the doc title.  I don't see its contents.
 
 
 Daniel (and all who can't access the doc),
 
 I'm attaching the PDF and ODT versions with updates based on Greg's comments. 
 I'd like to hear all opinions and comments. Google docs is a fairly ideal 
 environment for live commenting and editing, so it's too bad that you can't 
 access the file.
 
 Please let me know if you have any notes/comments on the design. If you'd 
 like to use the ODT file for comments and edits, please mark your input 
 clearly and I'll update the Google doc with your notes.
 
 Thanks,
 Ash
 SubversionCompressedPristinesDesign.pdf
 SubversionCompressedPristinesDesign.odt



Re: Compressed Pristines (Design Doc)

2012-03-25 Thread Greg Stein
Yeah... optional pristines is orthogonal, and should be considered
separately. It is also a very difficult problem because users of the
various APIs expect the pristine to always be present.

Cheers,
-g
On Mar 25, 2012 7:41 PM, Thomas Åkesson tho...@akesson.cc wrote:

 Hi Ash,

 I noticed that "Remove pristine store or render optional" is considered a
 Non-Goal. If changes are made to wc-db in order to manage compressed
 pristines, it might make sense to ensure that the design can also handle
 optional pristines in the future.

 The typical Subversion use case (code/text) will obviously benefit from
 compressed pristines. However, when storing binary files (e.g. graphics),
 which tend to be larger and less frequently modified files, optional
 pristines will likely be more beneficial.

 Thanks,
 Thomas Å.






Re: Compressed Pristines (Design Doc)

2012-03-25 Thread Ashod Nakashian

 From: Greg Stein gst...@gmail.com
To: Thomas Åkesson tho...@akesson.cc 
Cc: Ashod Nakashian ashodnakash...@yahoo.com; Subversion Development 
dev@subversion.apache.org 
Sent: Monday, March 26, 2012 7:42 AM
Subject: Re: Compressed Pristines (Design Doc)
 

Yeah... optional pristines is orthogonal, and should be considered separately. 
It is also a very difficult problem because users of the various APIs expect 
the pristine to always be present.
Cheers,
-g
On Mar 25, 2012 7:41 PM, Thomas Åkesson tho...@akesson.cc wrote:

Hi Ash,

I noticed that "Remove pristine store or render optional" is considered a 
Non-Goal. If changes are made to wc-db in order to manage compressed 
pristines, it might make sense to ensure that the design can also handle 
optional pristines in the future.

I do not mind in the least making provisions for such a future possibility. 
But like Greg said, it's really orthogonal to the feature at hand and carries 
too much complexity of its own to be included within the current scope, which 
is already a mouthful.

-Ash


The typical Subversion use case (code/text) will obviously benefit from 
compressed pristines. However, when storing binary files (e.g. graphics), 
which tend to be larger and less frequently modified files, optional 
pristines will likely be more beneficial.

Thanks,
Thomas Å.








Re: Compressed Pristines (Design Doc)

2012-03-25 Thread Branko Čibej
On 22.03.2012 17:01, Branko Čibej wrote:
 On 22.03.2012 16:50, Daniel Shahaf wrote:
 Branko Čibej wrote on Thu, Mar 22, 2012 at 16:37:24 +0100:
 It's called SQLite.
 Heh.  I wondered whether I should mention that the server uses BDB to
 store pristine files.  (yes, the situation there is different in
 several relevant ways)
 To clarify: I'm /not/ advocating that we store each and every file into
 an SQLite BLOB. Files larger than several block sizes would be better
 off on disk as real files (the compressor can, e.g., buffer compressed
 contents up to, say, 32k, and if they become larger, spill directly into
 a file; otherwise, dump into a BLOB). If we don't care about shared
 pristine store, we don't even need a separate database, these blobs can
 go into wc.db (which, as Greg points out, also serves as an index).
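The buffer-then-spill idea described above could look roughly like this. It is only a sketch under stated assumptions: zlib stands in for whatever compressor would be used, and `compress_pristine`, `spill_dir` and the return convention are hypothetical names, not an existing API.

```python
import os
import tempfile
import zlib

MAX_BLOB = 32768  # spill threshold; 32k is the example figure from the text


def compress_pristine(chunks, spill_dir, max_blob=MAX_BLOB):
    """Compress an iterable of byte chunks.

    Returns ("blob", data) while the compressed output fits in max_blob,
    otherwise ("file", path) with the compressed data written to spill_dir.
    """
    compressor = zlib.compressobj()
    buffered = bytearray()
    spill = None  # file object, opened once the buffer overflows
    path = None

    def emit(data):
        nonlocal spill, path
        if spill is not None:
            spill.write(data)
            return
        buffered.extend(data)
        if len(buffered) > max_blob:
            # Too large for a BLOB: move what we have into a real file.
            fd, path = tempfile.mkstemp(dir=spill_dir)
            spill = os.fdopen(fd, "wb")
            spill.write(buffered)
            buffered.clear()

    for chunk in chunks:
        emit(compressor.compress(chunk))
    emit(compressor.flush())

    if spill is None:
        return "blob", bytes(buffered)
    spill.close()
    return "file", path
```

The caller would insert the bytes into the database in the "blob" case and rename the temporary file into the pristine store in the "file" case.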


Since we need a few datapoints, I made a quick test to see what kind of
space savings we can get with SQLite. Note that I've not tried any
auto-vacuum settings, because my test only does insertions.

I used a checkout of the current HTTPD trunk for my data set, and
compressed all pristines, then moved them into a SQLite database
depending on size; first, all compressed files 8k or smaller, next, all
compressed files 32k or smaller. Note that my script does not prune
empty directories from the pristine fanout. Here's the log:

brane@zulu:~/src/httpd$ svn co http://svn.apache.org/repos/asf/httpd/httpd/trunk
[...]
 U   trunk
Checked out revision 1305001.
brane@zulu:~/src/httpd$ find trunk/.svn/pristine -type f | wc -l
3114
brane@zulu:~/src/httpd$ du -sh trunk/.svn/pristine/
 42M    trunk/.svn/pristine/
time gzip `find ./trunk/.svn/pristine -name '*.svn-base'`

real    0m14.569s
user    0m1.282s
sys     0m0.747s
brane@zulu:~/src/httpd$ du -sh trunk/.svn/pristine/
 17M    trunk/.svn/pristine/
brane@zulu:~/src/httpd$ find trunk/.svn/pristine -size -8k -type f | wc -l
2856
#
# N.B.: 8k max size per blob
#
brane@zulu:~/src/httpd$ time python pristine.py trunk/.svn/pristine/

real    0m29.683s
user    0m0.533s
sys     0m1.641s
brane@zulu:~/src/httpd$ du -sh trunk/.svn/pristine/
4.7M    trunk/.svn/pristine/
brane@zulu:~/src/httpd$ ll trunk/.svn/pristine//pristine.db
-rw-r--r--  1 brane  staff  322560 Mar 25 12:43 ps/pristine.db
#
# N.B.: 32k max size per blob
#
brane@zulu:~/src/httpd$ time python pristine.py trunk/.svn/pristine/

real    0m23.831s
user    0m0.529s
sys     0m1.616s
brane@zulu:~/src/httpd$ du -sh trunk/.svn/pristine/
1.2M    trunk/.svn/pristine/


The pristine.py script is attached.

Based on these observations, it's clear that the implementation should
proceed as follows:

Step 1: Just compress the pristine files, do not use any packing. This
gives a 60% decrease in disk usage in the HTTPD case, but even if the
decrease is only 30%, it's still worth the effort.

Step 2: Store small (for some definition of small) compressed pristine
files in a SQLite database. In the case of HTTPD, this gives up to an
extra 90% savings in disk usage, but this is a very specific test case and
it's hard to guess what kind of gain we'd get on average.
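For reference, the percentages in the two steps follow from the du -sh figures in the log above. This is a small sketch of that arithmetic; the variable names are made up, and the inputs are the rounded values du printed, so the results are approximate.

```python
def savings(before, after):
    """Fractional reduction in disk usage when going from before to after."""
    return 1.0 - after / before


# du -sh figures from the log, in megabytes
uncompressed = 42.0
gzipped = 17.0
blobs_8k = 4.7
blobs_32k = 1.2

step1 = savings(uncompressed, gzipped)    # about 0.60
step2_8k = savings(gzipped, blobs_8k)     # about 0.72
step2_32k = savings(gzipped, blobs_32k)   # about 0.93
```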

All in all, looking at these numbers, there's a /looong/ way to go before
we start playing with custom pack formats and compression of packed
similar files. I'm not at all sure we'll ever really need the potential
space savings of these methods, especially compared to the obvious risk
to WC stability that writing and testing such code brings.

Anyway, it's certain that creating this packed format is /not/ the first
step to take.

-- Brane
# pristine.py: move small compressed pristines into a SQLite database.
import os, sys
import sqlite3

MAX_BLOB = 32768  # largest compressed file, in bytes, stored as a blob

class Pristine(object):
    def __init__(self, database):
        self.conn = sqlite3.connect(database, isolation_level="IMMEDIATE")
        self.conn.text_factory = str
        self.cursor = self.conn.cursor()
        self.cursor.execute("PRAGMA page_size = 1024")
        self.cursor.execute("PRAGMA encoding = 'UTF-8'")

    @classmethod
    def create(cls, database):
        # Always start from an empty database.
        if os.path.exists(database):
            os.unlink(database)
        db = cls(database)
        db.cursor.execute("""CREATE TABLE pristine (
                               digest CHAR(40) PRIMARY KEY,
                               contents BLOB)""")
        db.conn.commit()
        return db

    def insert(self, filename):
        # The part of the file name before the first dot is the digest.
        digest = os.path.basename(filename).partition(".")[0]
        with open(filename, "rb") as f:
            contents = f.read()
        self.cursor.execute(
            "INSERT INTO pristine (digest, contents) VALUES (?, ?)",
            [digest, contents])
        self.conn.commit()
        # The blob now lives in the database; drop the on-disk copy.
        os.remove(filename)

if __name__ == "__main__":
    db = Pristine.create(os.path.join(sys.argv[1], "pristine.db"))
    for dirpath, dirnames, filenames in os.walk(sys.argv[1]):
        for name in filenames:
            # Only gzip-compressed pristines are candidates.
            if not name.endswith(".svn-base.gz"):
                continue
            filename = os.path.join(dirpath, name)
            # Larger files stay on disk as real files.
            if os.stat(filename).st_size > MAX_BLOB:
                continue
            db.insert(filename)

Re: Compressed Pristines (Design Doc)

2012-03-23 Thread Markus Schaber
Hi, Erik,


From: Erik Huelsmann [mailto:ehu...@gmail.com] 
 To substantiate that claim, I took the pristines directory from my Subversion 
 working copy and did some experimenting. See results  below:

 $ ls -ls uncompressed-pristines/*/*.svn-base | awk '{ tot += $1; } END { print "total size: " tot; }'
 total size: 188724

 $ cp -Rp uncompressed-pristines/ compressed-pristines
 $ gzip compressed-pristines/*/*.svn-base
 $ ls -ls compressed-pristines/*/*.svn-base.gz | awk '{ tot += $1; } END { print "total size: " tot; }'
 total size: 52320

 $ cat compressed-pristines/*/*.svn-base.gz > combined-compressed-file

Are you sure you should not combine the uncompressed pristines, and compress 
them afterwards? AFAICS, one of the points of the proposal is to profit from 
the inter-file redundancies.
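The difference between the two orders of operation can be illustrated with a small sketch. It uses Python's zlib on in-memory byte strings rather than real pristine files, and the function names are made up for the example. Note that deflate only looks back 32 KB, so this kind of stream sees redundancy only between files that sit close together in the combined data.

```python
import zlib


def per_file_size(files):
    """Total size when each file is compressed on its own."""
    return sum(len(zlib.compress(data)) for data in files)


def combined_size(files):
    """Size when the files are concatenated first and compressed as one stream."""
    return len(zlib.compress(b"".join(files)))


# Fifty small, nearly identical "files": combining them lets the compressor
# reuse what it has already seen in the earlier files.
samples = [b"int main(void) { return 0; }\n" * 20 + bytes([i]) for i in range(50)]
separate = per_file_size(samples)
together = combined_size(samples)
```

For inputs like these `together` comes out well below `separate`; concatenating files that are already compressed, as in the `cat` command above, cannot gain anything from inter-file redundancy.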

Best regards,

Markus Schaber
-- 
___
We software Automation.

3S-Smart Software Solutions GmbH
Markus Schaber | Development
Memminger Str. 151 | 87439 Kempten | Tel. +49-831-54031-0 | Fax +49-831-54031-50

Email: m.scha...@3s-software.com | Web: http://www.3s-software.com 
CoDeSys Internet-Forum: http://forum.3s-software.com

Managing Directors: Dipl.Inf. Dieter Hess, Dipl.Inf. Manfred Werner | 
Commercial register: Kempten HRB 6186 | VAT ID: DE 167014915


Re: Compressed Pristines (Design Doc)

2012-03-23 Thread Hyrum K Wright
On Wed, Mar 21, 2012 at 2:19 PM, Ashod Nakashian
ashodnakash...@yahoo.com wrote:
 All,

 I'm happy to share[1] with you the design document for the Compressed
 Pristines feature. The document is public and anyone can comment on any part
 (select, right-click and comment away). If you'd like to get *editing*
 permission, please email me and I'll add you to the list of editors.

 I'm sure there will be much to criticize and debate, I'd love to hear all
 input, but being pragmatic, I also would like to a) experiment and figure
 out the best approach in practice, backed with real data and consensus and
 b) to finish this feature rather than debate forever (it's been debated for
 almost a decade this December!).

 As such, I've left out whatever isn't clear yet or marked it with TBD notes,
 and at the same time I've already made experimental changes locally so that
 the design is informed by practice rather than purely academic (this, not to
 mention reading 100s of dev-list mails). I made a serious attempt at
 specifying as many of the hard facts/reqs/goals as possible to narrow the
 scope and avoid feature-creep.

 I'd like to take this feature on a lightweight branch and start committing
 code and getting reviews (and contributions!!) while we finalize the design
 and decide on the details (those who can create branches and grant commit
 rights please let me know when is the right time to do this - I'm ready and
 have code to commit and develop further).

 I thank everyone who will help us get this finally done in advance and look
 forward to hearing from you all.
 -Ash

 [1] https://docs.google.com/document/d/1ktIsewfMBMVBxbn-Ng8NwkNwAS_QJ6eC7GOygsbBeEc/edit

So, I've read through the design document, and the various threads,
and have a couple of comments / questions which I don't think have
been addressed.  My first impression, though, is to give you major
kudos for going through the effort to research and think about this
complex and subtle problem.  Now my thoughts...

As mentioned elsewhere, I too was surprised by the choice of a custom
container, though I think you make a good argument for it.  One
simplification I was thinking about is this: what if the container
only needed to support add and batch-delete operations?  These are the
current constraints of the existing pristine store; would they
introduce additional simplicity into your design?

In some respects, it looks like you're solving *two* problems:
compression and the internal fragmentation due to large FS block
sizes.  How orthogonal are the problems?  Could they be solved
independently of each other in some way?  I know that compression
exposes the internal fragmentation issue, but used alone it certainly
doesn't make things *worse* does it?

Finally, in all the above let's not let the perfect be the enemy of
the good.  If something *simple* will give us demonstrable
performance improvements now, can we do so without limiting our
ability to do a more complex and complete solution later?

Anyway, good work, and here's hoping it yields fruit.

-Hyrum


-- 

uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/


Re: Compressed Pristines (Design Doc)

2012-03-22 Thread Daniel Shahaf
Thanks Ash!  I'm in the middle of something right now, but I'll read it
once I'm done.

Ashod Nakashian wrote on Thu, Mar 22, 2012 at 00:15:21 -0700:
 
  From: Daniel Shahaf danie...@elego.de
 To: Greg Stein gst...@gmail.com 
 Cc: Ashod Nakashian ashodnakash...@yahoo.com; dev@subversion.apache.org 
 Sent: Wednesday, March 21, 2012 2:08 PM
 Subject: Re: Compressed Pristines (Design Doc)
  
 Greg Stein wrote on Wed, Mar 21, 2012 at 16:51:47 -0400:
  On Wed, Mar 21, 2012 at 16:11, Daniel Shahaf d...@daniel.shahaf.name 
  wrote:
   Ashod Nakashian wrote on Wed, Mar 21, 2012 at 12:19:02 -0700:
   All,
  
   I'm happy to share[1] with you the design document for the Compressed 
   Pristines feature. The document is public and anyone can comment on any 
   part
  
   I can't.  Can you please move the document to our wiki, or dump it in an
   email to dev@, or on a pastebin, somewhere everyone can read it.
  
  I just opened it in an incognito window in Chrome. You should be able
  to access the thing.
  
 
 Tried, I get as far as the doc title.  I don't see its contents.
 
 
 Daniel (and all who can't access the doc),
 
 I'm attaching the PDF and ODT versions with updates based on Greg's comments. 
 I'd like to hear all opinions and comments. Google docs is a fairly ideal 
 environment for live commenting and editing, so it's too bad that you can't 
 access the file.
 
 Please let me know if you have any notes/comments on the design. If you'd 
 like to use the ODT file for comments and edits, please mark your input 
 clearly and I'll update the Google doc with your notes.
 
 Thanks,
 Ash




Re: Compressed Pristines (Design Doc)

2012-03-22 Thread Daniel Shahaf
OK, I've had a cruise through now.

First of all I have to say it's an order of magnitude larger than what
I'd imagined it would be.  That makes the "move it elsewhere" idea I'd
had less practical than I'd predicted.  I'm also not intending to take
you up on your offer to proxy me to the doc, though thanks for making it.

Design-wise I'm a bit surprised that the choice ended up being rolling
a custom file format.

Thanks for your work.

Cheers,

Daniel






Re: Compressed Pristines (Design Doc)

2012-03-22 Thread Ashod Nakashian

 From: Daniel Shahaf danie...@elego.de
To: Ashod Nakashian ashodnakash...@yahoo.com 
Cc: dev@subversion.apache.org dev@subversion.apache.org 
Sent: Thursday, March 22, 2012 7:30 AM
Subject: Re: Compressed Pristines (Design Doc)
 
OK, I've had a cruise through now.

First of all I have to say it's an order of magnitude larger than what
I'd imagined it would be.  That makes the "move it elsewhere" idea I'd
had less practical than I'd predicted.  I'm also not intending to take
you up on your offer to proxy me to the doc, though thanks for making it.

If there are any ideas for simplifying things, I think it's well worth the 
effort. I for one am not for unnecessary complexity. This is why I took the time 
to outline a set of requirements. If the requirements are excessive, let's 
simplify them first. Only on the basis of the requirements can one justify the 
design.


Design-wise I'm a bit surprised that the choice ended up being rolling
a custom file format.

Personally I know not of any library that can deliver the requirements that we 
need (outlined in the doc). Again, if the requirements are in question, let's 
simplify them. If there is such a library, suggesting it will save us a lot of 
time and effort. Otherwise, using a Tar-like container will just not cut it. On 
the other hand, the proposed custom format is rather simple and its code 
shouldn't be complex. In fact, I suspect Tar is more complex (considering it 
must store more information than we do).


-Ash


Thanks for your work.

Cheers,

Daniel









Re: Compressed Pristines (Design Doc)

2012-03-22 Thread Mark Phippard
On Thu, Mar 22, 2012 at 11:18 AM, Ashod Nakashian
ashodnakash...@yahoo.com wrote:
Design-wise I'm a bit surprised that the choice ended up being rolling
a custom file format.

 Personally I know not of any library that can deliver the requirements that 
 we need (outlined in the doc). Again, if the requirements
 are in question, let's simplify them. If there is such a library, suggesting 
 it will save us a lot of time and effort. Otherwise, using a
 Tar-like container will just not cut it. On the other hand, the proposed 
 custom format is rather simple and its code shouldn't be
 complex. In fact, I suspect Tar is more complex (considering it must store 
 more information than we do).

I am not sure what Daniel meant, but I had always just assumed we
would simply compress the files in the existing pristines.  I think
your document does a nice job explaining why that is not good enough.
In that sense, I would also say that I was surprised by the choice of
a custom file format, but that does not mean I would question it.  I
think your document does a nice job in revealing some of the subtle
complexities of this feature.  That gives me more hope on progress
towards a solution.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/


Re: Compressed Pristines (Design Doc)

2012-03-22 Thread Branko Čibej
On 22.03.2012 16:23, Mark Phippard wrote:
 On Thu, Mar 22, 2012 at 11:18 AM, Ashod Nakashian
 ashodnakash...@yahoo.com wrote:
 Design-wise I'm a bit surprised that the choice ended up being rolling
 a custom file format.
 Personally I know not of any library that can deliver the requirements that 
 we need (outlined in the doc). Again, if the requirements
 are in question, let's simplify them. If there is such a library, suggesting 
 it will save us a lot of time and effort. Otherwise, using a
 Tar-like container will just not cut it. On the other hand, the proposed 
 custom format is rather simple and its code shouldn't be
 complex. In fact, I suspect Tar is more complex (considering it must store 
 more information than we do).
 I am not sure what Daniel meant, but I had always just assumed we
 would simply compress the files in the existing pristines.  I think
 your document does a nice job explaining why that is not good enough.
 In that sense, I would also say that I was surprised by the choice of
 a custom file format, but that does not mean I would question it.  I
 think your document does a nice job in revealing some of the subtle
 complexities of this feature.  That gives me more hope on progress
 towards a solution.

I'd like to point out that there /is/ a library that handles storage,
lookup, access and deletion of many small files in a single large one
quite efficiently. Well tested, too, widely used, and configurable with
regard to space reclamation. Moreover, we're already using that library.
It's called SQLite.

-- Brane


Re: Compressed Pristines (Design Doc)

2012-03-22 Thread Daniel Shahaf
Ashod Nakashian wrote on Thu, Mar 22, 2012 at 08:18:40 -0700:
 
  From: Daniel Shahaf danie...@elego.de
 To: Ashod Nakashian ashodnakash...@yahoo.com 
 Cc: dev@subversion.apache.org dev@subversion.apache.org 
 Sent: Thursday, March 22, 2012 7:30 AM
 Subject: Re: Compressed Pristines (Design Doc)
  
 OK, I've had a cruise through now.
 
 First of all I have to say it's an order of magnitude larger than what
 I'd imagined it would be.  That makes the "move it elsewhere" idea I'd
 had less practical than I'd predicted.  I'm also not intending to take
 you up on your offer to proxy me to the doc, though thanks for making it.
 
 If there are any ideas for simplifying things, I think it's well worth
 the effort. I for one am not for unnecessary complexity. This is why
 I took the time to outline a set of requirements. If the requirements
 are excessive, let's simplify them first. And based on the requirements
 alone can one justify the design.
 

Fair enough.

One requirement is extensibility (features in 1.9 timeframe, assuming
your design is released in 1.8).  I see you included a format number,
but --- for example --- perhaps the index entries should contain a few
RESERVED bytes too?  (It would have helped a lot in manually fixing FSFS
corruptions if we'd left a few unused bytes here and there in revision
files...)
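
A minimal sketch of the RESERVED-bytes idea, in Python for brevity. The field layout below (SHA-1, offset, two sizes, eight reserved bytes) is purely hypothetical and is not taken from the design document; it only shows how a reader can skip reserved space so that a later format can give it a meaning.

```python
import struct

# Hypothetical layout, NOT the one in the design document:
#   20-byte SHA-1 | 8-byte offset | 4-byte packed size | 4-byte raw size | 8 reserved bytes
ENTRY = struct.Struct(">20sQII8s")
RESERVED = b"\0" * 8


def pack_entry(sha1, offset, packed_size, raw_size):
    """Serialize one index entry; reserved bytes are always written as zeros."""
    return ENTRY.pack(sha1, offset, packed_size, raw_size, RESERVED)


def unpack_entry(buf):
    """Parse one index entry, ignoring the reserved bytes so that a later
    format can assign them a meaning without breaking this reader."""
    sha1, offset, packed_size, raw_size, _reserved = ENTRY.unpack(buf)
    return sha1, offset, packed_size, raw_size
```

The cost in this sketch is eight bytes per entry; whether that is acceptable depends on how many entries an index is expected to hold.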

Another requirement is concurrency.  ra_serf downloads files
concurrently, and the editor (svn_delta_editor_t, 1.8's svn_editor_t)
allows retrieving the text of multiple files concurrently.  Does your
design allow for adding two new pristines with their contents arriving
interleaved?  (There is one thread in the client process, but several
TCP sockets.)
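
To illustrate the question (not to answer it for the proposed format), here is a sketch of what interleaved arrival looks like to the store. It assumes one zlib compressor per in-flight pristine; the class name and the way finished items are collected are hypothetical.

```python
import zlib


class PendingPristine:
    """Accumulates the compressed form of one pristine as its chunks arrive."""

    def __init__(self):
        self._z = zlib.compressobj()
        self._parts = []

    def feed(self, chunk):
        self._parts.append(self._z.compress(chunk))

    def finish(self):
        self._parts.append(self._z.flush())
        return b"".join(self._parts)


# Chunks for two files arrive interleaved, as they might from two sockets.
arrivals = [("a", b"hello "), ("b", b"foo "), ("a", b"world"), ("b", b"bar")]
pending = {}
for key, chunk in arrivals:
    pending.setdefault(key, PendingPristine()).feed(chunk)
done = {key: p.finish() for key, p in pending.items()}
```

With a single container file, each finished item would presumably be appended only once it is complete, which means buffering it somewhere until then.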

 
 Design-wise I'm a bit surprised that the choice ended up being rolling
 a custom file format.
 
 Personally I know not of any library that can deliver the requirements
 that we need (outlined in the doc). Again, if the requirements are in

I'm not familiar offhand with such a library either, but perhaps someone
else on list is.

 question, let's simplify them. If there is such a library, suggesting
 it will save us a lot of time and effort. Otherwise, using a Tar-like
 container will just not cut it. On the other hand, the proposed custom
 format is rather simple and its code shouldn't be complex. In fact,
 I suspect Tar is more complex (considering it must store more
 information than we do).
 

Let's see how far we can get with the custom format.  If the "someone
invented that wheel already" factor pops up too often I'm sure we'll
notice.


Cheers,

Daniel

 
 -Ash
 
 
 Thanks for your work.
 
 Cheers,
 
 Daniel
 
 Ashod Nakashian wrote on Thu, Mar 22, 2012 at 00:15:21 -0700:
  
   From: Daniel Shahaf danie...@elego.de
  To: Greg Stein gst...@gmail.com 
  Cc: Ashod Nakashian ashodnakash...@yahoo.com; dev@subversion.apache.org 
  Sent: Wednesday, March 21, 2012 2:08 PM
  Subject: Re: Compressed Pristines (Design Doc)
   
  Greg Stein wrote on Wed, Mar 21, 2012 at 16:51:47 -0400:
   On Wed, Mar 21, 2012 at 16:11, Daniel Shahaf d...@daniel.shahaf.name 
   wrote:
Ashod Nakashian wrote on Wed, Mar 21, 2012 at 12:19:02 -0700:
All,
   
I'm happy to share[1] with you the design document for the 
Compressed Pristines feature. The document is public and anyone can 
comment on any part
   
I can't.  Can you please move the document to our wiki, or dump it in 
an
email to dev@, or on a pastebin, somewhere everyone can read it?
   
   I just opened it in an incognito window in Chrome. You should be able
   to access the thing.
   
  
  Tried, I get as far as the doc title.  I don't see its contents.
  
  
  Daniel (and all who can't access the doc),
  
  I'm attaching the PDF and ODT versions with updates based on Greg's 
  comments. I'd like to hear all opinions and comments. Google docs is a 
  fairly ideal environment for live commenting and editing, so it's too bad 
  that you can't access the file.
  
  Please let me know if you have any notes/comments on the design. If you'd 
  like to use the ODT file for comments and edits, please mark your input 
  clearly and I'll update the Google doc with your notes.
  
  Thanks,
  Ash
 
 
 
 
 
 


AW: Compressed Pristines (Design Doc)

2012-03-22 Thread Markus Schaber
Hi,

I just want to shed light on three arguments against a new custom archive 
format.

Compressing the files using a standard format (like gz or xz) file-by-file has 
the advantage of better debuggability. Developers can easily (de)compress or 
otherwise inspect those files using their standard utilities when trying to debug 
problems. Using a custom format always makes that process more difficult.

In addition, more and more file systems support features like 
block suballocation, tail packing or tail merging[1]. This drastically reduces 
the space loss due to files being smaller than the block size.

And the third argument is the simplicity of implementation. Just checking 
.svn/pristines/ab/abcd.gz in addition to .svn/pristines/ab/abcd when searching 
for a pristine file is much easier to implement.
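
A sketch of that lookup, in Python for brevity. The directory layout follows the example paths above (two-character shard directory, optional .gz suffix); the function name is hypothetical and this is not how the working copy code is actually structured.

```python
import gzip
import os


def open_pristine(pristine_dir, checksum):
    """Return a binary file object yielding the uncompressed pristine text.

    Tries <dir>/<first two characters>/<checksum>.gz first, then the plain file.
    """
    base = os.path.join(pristine_dir, checksum[:2], checksum)
    if os.path.exists(base + ".gz"):
        return gzip.open(base + ".gz", "rb")
    return open(base, "rb")
```

Callers read from the returned object either way, so compressed and uncompressed pristines can coexist during a transition.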

I'm not opposed in general to storing pristines in an archive, but the 
disadvantages should be weighed when making the decision.


A different, somewhat related idea is a common pristine store somewhere in the 
user's directory, shared by several working copies. Especially when checking out 
several working copies of the same project (or similar branches), this could 
save a lot of network traffic.


Best regards

Markus Schaber
[1]: 
http://msdn.microsoft.com/en-us/library/windows/desktop/ee681827%28v=vs.85%29.aspx
 claims tail packing support for NTFS. 
http://en.wikipedia.org/wiki/Block_suballocation claims support for Btrfs, 
ReiserFS, Reiser4, FreeBSD UFS2. And AFAIR, XFS has a similar feature.

-----Original Message-----
From: Mark Phippard [mailto:markp...@gmail.com] 
Sent: Thursday, 22 March 2012 16:23
To: Ashod Nakashian
Cc: Daniel Shahaf; dev@subversion.apache.org
Subject: Re: Compressed Pristines (Design Doc)

On Thu, Mar 22, 2012 at 11:18 AM, Ashod Nakashian ashodnakash...@yahoo.com 
wrote:
Design-wise I'm a bit surprised that the choice ended up being rolling 
a custom file format.

 Personally I know not of any library that can deliver the requirements 
 that we need (outlined in the doc). Again, if the requirements are in 
 question, let's simplify them. If there is such a library, suggesting 
 it will save us a lot of time and effort. Otherwise, using a Tar-like 
 container will just not cut it. On the other hand, the proposed custom format 
 is rather simple and its code shouldn't be complex. In fact, I suspect Tar is 
 more complex (considering it must store more information than we do).

I am not sure what Daniel meant, but I had always just assumed we would simply 
compress the files in the existing pristines.  I think your document does a 
nice job explaining why that is not good enough.
In that sense, I would also say that I was surprised by the choice of a custom 
file format, but that does not mean I would question it.  I think your document 
does a nice job in revealing some of the subtle complexities of this feature.  
That gives me more hope on progress towards a solution.

--
Thanks

Mark Phippard
http://markphip.blogspot.com/


Re: Compressed Pristines (Design Doc)

2012-03-22 Thread Daniel Shahaf
Branko Čibej wrote on Thu, Mar 22, 2012 at 16:37:24 +0100:
 It's called SQLite.

Heh.  I wondered whether I should mention that the server uses BDB to
store pristine files.  (yes, the situation there is different in
several relevant ways)


Re: Compressed Pristines (Design Doc)

2012-03-22 Thread Branko Čibej
On 22.03.2012 16:50, Daniel Shahaf wrote:
 Branko Čibej wrote on Thu, Mar 22, 2012 at 16:37:24 +0100:
 It's called SQLite.
 Heh.  I wondered whether I should mention that the server uses BDB to
 store pristine files.  (yes, the situation there is different in
 several relevant ways)

To clarify: I'm /not/ advocating that we store each and every file into
an SQLite BLOB. Files larger than several block sizes would be better
off on disk as real files (the compressor can, e.g., buffer compressed
contents up to, say, 32k, and if they become larger, spill directly into
a file; otherwise, dump into a BLOB). If we don't care about a shared
pristine store, we don't even need a separate database; these blobs can
go into wc.db (which, as Greg points out, also serves as an index).
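
A rough sketch of the buffer-then-spill idea, in Python for brevity. The table name pristine_blob, the ".z" file suffix and the function name are hypothetical; only the 32k threshold comes from the text above.

```python
import os
import sqlite3
import zlib

SPILL_LIMIT = 32 * 1024  # the "say, 32k" threshold mentioned above


def store_pristine(db, spill_dir, checksum, chunks):
    """Compress `chunks`; keep the result in a BLOB if it stays small,
    otherwise spill it to a regular file. Returns "blob" or "file"."""
    z = zlib.compressobj()
    buffered, size, out = [], 0, None
    path = os.path.join(spill_dir, checksum + ".z")

    def emit(data):
        nonlocal size, out
        if not data:
            return
        size += len(data)
        if out is None and size > SPILL_LIMIT:
            # Threshold crossed: move what was buffered so far into a file.
            out = open(path, "wb")
            out.write(b"".join(buffered))
            buffered.clear()
        if out is not None:
            out.write(data)
        else:
            buffered.append(data)

    for chunk in chunks:
        emit(z.compress(chunk))
    emit(z.flush())

    if out is not None:
        out.close()
        return "file"
    db.execute("INSERT INTO pristine_blob (checksum, data) VALUES (?, ?)",
               (checksum, b"".join(buffered)))
    return "blob"
```

The decision is made on the compressed size, so a large but highly compressible file can still end up in the database. Error handling (closing the spill file on failure) is omitted here.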

-- Brane


Re: Compressed Pristines (Design Doc)

2012-03-22 Thread Ivan Zhakov
On Thu, Mar 22, 2012 at 18:30, Daniel Shahaf danie...@elego.de wrote:
 OK, I've had a cruise through now.

 First of all I have to say it's an order of magnitude larger than what
 I'd imagined it would be.  That makes the "move it elsewhere" idea I'd
 had less practical than I'd predicted.  I'm also not intending to take
 you up on your offer to proxy me to the doc, though thanks for making it.

 Design-wise I'm a bit surprised that the choice ended up being rolling
 a custom file format.

 Thanks for your work.

+1. I believe we should implement compressed pristines in a simple way:
just compress the pristine files themselves, without inventing a new
format.


-- 
Ivan Zhakov


Re: Compressed Pristines (Design Doc)

2012-03-22 Thread Erik Huelsmann
Hi Ash,

Thanks for picking up the initiative to implement this feature.

On Thu, Mar 22, 2012 at 7:01 PM, Ivan Zhakov i...@visualsvn.com wrote:

 On Thu, Mar 22, 2012 at 18:30, Daniel Shahaf danie...@elego.de wrote:
  OK, I've had a cruise through now.
 
  First of all I have to say it's an order of magnitude larger than what
  I'd imagined it would be.  That makes the "move it elsewhere" idea I'd
  had less practical than I'd predicted.  I'm also not intending to take
  you up on your offer to proxy me to the doc, though thanks for making it.
 
  Design-wise I'm a bit surprised that the choice ended up being rolling
  a custom file format.
 
  Thanks for your work.
 
 +1. I believe we should implement compressed pristines in a simple way:
 just compress the pristine files themselves, without inventing a new
 format.


Like the others, I'm surprised we seem to be going with a custom file format.
You claim source files are generally small in size and hence only small
benefits can be had from compressing them, if at all, due to the fact that
they would be of sub-block size already.

To substantiate that claim, I took the pristines directory from my
Subversion working copy and did some experimenting. See results  below:

 $ ls -ls uncompressed-pristines/*/*.svn-base | awk '{ tot += $1; } END { print "total size: " tot; }'
total size: 188724

 $ cp -Rp uncompressed-pristines/ compressed-pristines
 $ gzip compressed-pristines/*/*.svn-base
 $ ls -ls compressed-pristines/*/*.svn-base.gz | awk '{ tot += $1; } END { print "total size: " tot; }'
total size: 52320

 $ cat compressed-pristines/*/*.svn-base.gz > combined-compressed-file
 $ ls -ls combined-compressed-file
41812 


So, if I look at the Subversion pristines in my working copy, the reduction
in allocated blocks goes from 100% to 27%. To be honest, I doubt the
complexity we'll be importing just to reduce the allocated number of blocks
from 27% to 22% is really worth it: the savings are already tremendous.
Won't the creation of a custom storage format just serve to destabilize our
working copy?
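
The percentages follow directly from the three block counts reported by ls -s above. A small Python check (the helper name is arbitrary):

```python
def percent_of(blocks, total):
    """Express a block count as a percentage of the uncompressed total."""
    return 100.0 * blocks / total


uncompressed = 188724  # blocks, original pristines
per_file_gz = 52320    # blocks after gzipping each file separately
combined = 41812       # blocks for the concatenated .gz files

for label, blocks in [("per-file gzip", per_file_gz), ("combined", combined)]:
    print(f"{label}: {percent_of(blocks, uncompressed):.1f}% of the original")
```

This prints 27.7% for per-file gzip and 22.2% for the combined file, matching the rounded 27% and 22% quoted above.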


Do you have data which triggered you to design this custom format?


Bye,


Erik.


Re: Compressed Pristines (Design Doc)

2012-03-22 Thread Philip Martin
Erik Huelsmann ehu...@gmail.com writes:

 As the others, I'm surprised we seem to be going with a custom file format.
 You claim source files are generally small in size and hence only small
 benefits can be had from compressing them, if at all, due to the fact that
 they would be of sub-block size already.

I was surprised too, so I looked at GCC where a trunk checkout has
75,000 files of various types:

$ find .svn/pristine -type f | wc -l
75192

Uncompressed:

$ du -hs .svn/pristine
635M    .svn/pristine
$ find .svn/pristine -type f | xargs ls -ls | awk '{tot += $1} END {print tot}'
641536

Individually compressed is smaller by a factor of 2:

$ find .svn/pristine -type f | xargs gzip
$ du -hs .svn/pristine
367M    .svn/pristine
$ find .svn/pristine -type f | xargs ls -ls | awk '{tot += $1} END {print tot}'
365624

As one single file is smaller by another factor of 3:

$ find .svn/pristine -type f | xargs cat > one-big-file
$ du -hs one-big-file
122M    one-big-file
$ ls -ls one-big-file | awk '{print $1}'
124516

When individually compressed most of the 75,000 files are less
than 4K:

$ find .svn/pristine -size -4096c | wc -l
71571

more than half are less than 1K:

$ find .svn/pristine -size -1024c | wc -l
53707

and nearly half are less than 0.5K:

$ find .svn/pristine -size -512c | wc -l
36521

In the uncompressed state:

62323 are less than 4K
36648 are less than 1K
21828 are less than 0.5K

Maybe GCC is not typical but, rather to my surprise, combining the
compressed files would be a significant improvement.
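
Why combining helps so much when most files are tiny can be shown with a simple model. The sketch below assumes a 4096-byte block size, no tail packing, and ignores inode and directory overhead; those are assumptions for illustration, not measurements from the checkouts above.

```python
def allocated_bytes(sizes, block_size=4096):
    """Disk space used when every file occupies whole blocks."""
    return sum(-(-size // block_size) * block_size for size in sizes)


def packed_bytes(sizes, block_size=4096):
    """Disk space used when all files are stored back to back in one file."""
    return -(-sum(sizes) // block_size) * block_size
```

For example, 1000 files of 500 bytes each occupy 4,096,000 bytes as separate files under this model but only 503,808 bytes when packed, because each small file otherwise wastes most of a block.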

I also have an httpd trunk checkout (needs cleanup so bigger than
normal):

90M uncompressed
37M individually compressed
23M as one big file

That's more like your figures for Subversion where the major step is
individual compression.



Re: Compressed Pristines (Design Doc)

2012-03-21 Thread Ashod Nakashian
All,

I'm happy to share[1] with you the design document for the Compressed Pristines 
feature. The document is public and anyone can comment on any part (select, 
right-click and comment away). If you'd like to get *editing* permission, 
please email me and I'll add you to the list of editors.

I'm sure there will be much to criticize and debate, I'd love to hear all 
input, but being pragmatic, I also would like to a) experiment and figure out 
the best approach in practice, backed with real data and consensus and b) to 
finish this feature rather than debate forever (it's been debated for almost a 
decade this December!).


As such, where things are not yet clear, I've either left them out or written 
TBD notes. At the same time, I've already made experimental changes locally so 
that the design rests on practical findings rather than being purely academic 
(this, not to mention reading 100s of dev-list mails). I made a serious attempt 
at specifying as many of the hard facts/reqs/goals as possible to narrow the 
scope and avoid feature-creep.

I'd like to take this feature on a lightweight branch and start committing 
code and getting reviews (and contributions!!) while we finalize the design and 
decide on the details (those who can create branches and grant commit rights 
please let me know when is the right time to do this - I'm ready and have code 
to commit and develop further). 


I thank in advance everyone who will help us finally get this done, and look 
forward to hearing from you all.
-Ash

[1] https://docs.google.com/document/d/1ktIsewfMBMVBxbn-Ng8NwkNwAS_QJ6eC7GOygsbBeEc/edit



 From: Hyrum K Wright hyrum.wri...@wandisco.com
To: Ashod Nakashian ashodnakash...@yahoo.com 
Cc: Philip Martin philip.mar...@wandisco.com; Greg Stein gst...@gmail.com; 
dev@subversion.apache.org dev@subversion.apache.org 
Sent: Monday, March 12, 2012 5:31 PM
Subject: Re: Compressed Pristines
 
On Mon, Mar 12, 2012 at 7:11 AM, Ashod Nakashian
ashodnakash...@yahoo.com wrote:
 - Original Message -

 From: Philip Martin philip.mar...@wandisco.com
 To: Ashod Nakashian ashodnakash...@yahoo.com
 Cc: Greg Stein gst...@gmail.com; dev@subversion.apache.org 
 dev@subversion.apache.org
 Sent: Monday, March 12, 2012 2:40 PM
 Subject: Re: Compressed Pristines

 Ashod Nakashian ashodnakash...@yahoo.com writes:

  * Are there any documentation/design/discussions on this feature that
  I could study?

 There has been some discussion in the past on the dev list.  I don't
 think it is written down anywhere else.

  * Who should coordinate and be contacted on decision points?

 The dev list.

  * I know this feature was planned for 1.8. Is that still reasonable?
  (I can't find a release date for 1.8) Will 1.8 wait for this or has
  this feature been demoted to a low-priority in general? The reason I
  ask is to have a vague idea as to how close this feature is on the
  critical path to future releases.

 We don't plan like that.  The features that will go into 1.8 are the
 features people choose to write.

 Got it. Thanks.

 Here is my plan. I'll study whatever discussion took place on dev-list. My 
 main technical concern is related to any complications that aren't obvious 
 (due to design or requirements elsewhere in svn). I'll come up with an 
 architectural overview for the feature design and a breakdown of major 
 milestones. When ready, I'll share them on this list and open it for 
 discussion. Based on the results, code changes can commence.

A few observations based upon my past poking around this area.

The current implementation of the pristine store is designed to be
streamy.  That is, external users aren't supposed to know or care
where the actual contents live, or how they are accessed, but simply
get a stream from which they can read and write the contents.  In
principle, this should make compressing said contents relatively easy,
as we could just insert a compressing stream in this pipeline and
everything would automagically work.  Since this hasn't yet happened,
you can probably guess that it isn't as easy as that. :)
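
The "insert a compressing stream" idea can be sketched as a thin wrapper around a writable stream. This is Python for brevity and does not use Subversion's actual stream API; the class name is hypothetical.

```python
import io
import zlib


class CompressingWriter:
    """Wraps a writable binary stream and compresses everything written to it."""

    def __init__(self, raw):
        self._raw = raw
        self._z = zlib.compressobj()

    def write(self, data):
        self._raw.write(self._z.compress(data))
        return len(data)

    def close(self):
        # Flush the compressor; the underlying stream is left open for the caller.
        self._raw.write(self._z.flush())
```

A caller that only ever writes through the wrapper never sees the compressed form, which is the property the streamy design is meant to provide.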

The primary issue when I looked at this problem was that the streamy
abstraction is broken in several places, such as when we install the
new pristine file.  There are also certain consumers, such as
external diff tools, that require an uncompressed on-disk file to
operate on, and we currently just provide the pristine as that file.
Compressed pristines would require recreating the uncompressed version
when such a tool is invoked.  Whether this is a useful tradeoff, I
don't know.
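
Recreating the uncompressed version for such a tool could look roughly like this. The sketch assumes gzip-compressed pristines and a hypothetical helper name; the caller is responsible for deleting the temporary file once the tool exits.

```python
import gzip
import shutil
import tempfile


def materialize_pristine(gz_path):
    """Decompress a gzip-compressed pristine into a temporary file and
    return its path; the caller must delete it after the tool exits."""
    with gzip.open(gz_path, "rb") as src, \
         tempfile.NamedTemporaryFile(delete=False, suffix=".svn-base") as dst:
        shutil.copyfileobj(src, dst)
        return dst.name
```

The tradeoff mentioned above shows up here as one extra decompression and one temporary file per invocation of the external tool.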

Generally, though, I'm +1 for compressed pristines, as that was one of
the design goals of wc-ng in the first place.  (Oh, and extra points
for selectively compressing pristines based upon mime-type.)

So start digging in, asking on the dev@ list and #svn-dev on Freenode,
and sending in patches.  You'll find a community of folks eager for
your input, and willing to help you.

Hope that helps,
-Hyrum



Re: Compressed Pristines (Design Doc)

2012-03-21 Thread Daniel Shahaf
Ashod Nakashian wrote on Wed, Mar 21, 2012 at 12:19:02 -0700:
 All,
 
 I'm happy to share[1] with you the design document for the Compressed 
 Pristines feature. The document is public and anyone can comment on any part

I can't.  Can you please move the document to our wiki, or dump it in an
email to dev@, or on a pastebin, somewhere everyone can read it?

Thanks


Re: Compressed Pristines (Design Doc)

2012-03-21 Thread Greg Stein
On Wed, Mar 21, 2012 at 16:11, Daniel Shahaf d...@daniel.shahaf.name wrote:
 Ashod Nakashian wrote on Wed, Mar 21, 2012 at 12:19:02 -0700:
 All,

 I'm happy to share[1] with you the design document for the Compressed 
 Pristines feature. The document is public and anyone can comment on any part

 I can't.  Can you please move the document to our wiki, or dump it in an
 email to dev@, or on a pastebin, somewhere everyone can read it?

I just opened it in an incognito window in Chrome. You should be able
to access the thing.

Cheers,
-g


Re: Compressed Pristines (Design Doc)

2012-03-21 Thread Daniel Shahaf
Greg Stein wrote on Wed, Mar 21, 2012 at 16:51:47 -0400:
 On Wed, Mar 21, 2012 at 16:11, Daniel Shahaf d...@daniel.shahaf.name wrote:
  Ashod Nakashian wrote on Wed, Mar 21, 2012 at 12:19:02 -0700:
  All,
 
  I'm happy to share[1] with you the design document for the Compressed 
  Pristines feature. The document is public and anyone can comment on any 
  part
 
  I can't.  Can you please move the document to our wiki, or dump it in an
  email to dev@, or on a pastebin, somewhere everyone can read it?
 
 I just opened it in an incognito window in Chrome. You should be able
 to access the thing.
 

Tried, I get as far as the doc title.  I don't see its contents.


Re: Compressed Pristines (Design Doc)

2012-03-21 Thread Branko Čibej
On 21.03.2012 22:08, Daniel Shahaf wrote:
 Greg Stein wrote on Wed, Mar 21, 2012 at 16:51:47 -0400:
 On Wed, Mar 21, 2012 at 16:11, Daniel Shahaf d...@daniel.shahaf.name wrote:
 Ashod Nakashian wrote on Wed, Mar 21, 2012 at 12:19:02 -0700:
 All,

 I'm happy to share[1] with you the design document for the Compressed 
 Pristines feature. The document is public and anyone can comment on any 
 part
 I can't.  Can you please move the document to our wiki, or dump it in an
 email to dev@, or on a pastebin, somewhere everyone can read it?
 I just opened it in an incognito window in Chrome. You should be able
 to access the thing.

 Tried, I get as far as the doc title.  I don't see its contents.

Enable JavaScript in your browser. Sorry.

-- Brane