[Bug-tar] Bug? Where? Why? Why so many files changing as we read them?
This may not be a tar bug, but I am trying to figure out what is going on and/or why so many 'changes'...

I am using tar V1.28 on linux-4.1.0, reading from a Win7-SP1 workstation via 'CIFS' (mounted fs), in my "Windows Home directory" as seen as a mounted file system via CIFS:

    //Athenae/C/ /athenae/ cifs user,noauto,rw,uid=x,gid=x,nocase,serverino,creds=xxx,setuids 0 0

Trying to run tar, as such:

    tar cf ~/Appdata.tar --acls --xattrs Appdata

I got **78** lines (out of 64389 files in 12357 directories) that claim "something changed". Note: I wasn't running any of the below related apps as tar ran:

    tar: Appdata/Local/Google/Chrome/User Data/Default/Cache: file changed as we read it
    tar: Appdata/Local/Google/Chrome/User Data/Default: file changed as we read it
    tar: Appdata/Local/Google/Chrome/User Data: file changed as we read it
    tar: Appdata/Local/Microsoft/ehome/Art Cache: file changed as we read it
    tar: Appdata/Local/Microsoft/Media Player/Art Cache/LocalMLS: file changed as we read it
    tar: Appdata/Local/Microsoft/OneNote/12.0/Backup/Personal Notebook: file changed as we read it
    tar: Appdata/Local/Microsoft/OneNote/12.0/Backup: file changed as we read it
    ...
    tar: Exiting with failure status due to previous errors

Of minor note, "find" commands run locally via cygwin and on linux (via CIFS) both show the same # files+dirs (76746) -- showing a 1:1 mapping.

I did an "ls -l" from both sides (cyg v. linux+CIFS) that shows the last mod times. I looked at "Appdata/Local/Microsoft/OneNote/12.0/Backup" and the subdir "Personal Notebook":

    Oct 15 2012 Backup
    Mar 18 2015 Personal Notebook

I.e. neither looks like they changed recently. So why the messages?

Thanks!
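A rough illustration of the kind of check that can produce this message. As far as I understand it, tar compares the status of each file or directory from before and after it reads it, and a difference in ctime (which "ls -l" does not show) or in size is enough to trigger the warning. The Python sketch below is not tar's code; the function name is made up, and the exact fields tar 1.28 compares are an assumption here.

```python
import os

def changed_while_reading(path):
    """Return True if ctime or size differ before and after reading `path`.

    Hypothetical helper: it mimics the general idea of a
    "file changed as we read it" check, not GNU tar's exact logic.
    """
    before = os.stat(path)
    if os.path.isdir(path):
        os.listdir(path)              # "read" a directory by listing it
    else:
        with open(path, "rb") as f:
            while f.read(65536):      # read the whole file, discard data
                pass
    after = os.stat(path)
    return (before.st_ctime_ns != after.st_ctime_ns
            or before.st_size != after.st_size)
```

Running something like this against the directories listed above, on the CIFS mount, would show whether the client reports a different ctime or size on two consecutive stat calls even though the modification time looks old.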
Re: [Bug-tar] Regarding untaring tar files containing symlinks
Ah, I have my cygwin cygdrive path prefix set to /:

    mount -p
    Prefix Type Flags
    / user binmode

You should be able to set it once:

    mount -c /

I also have my cygwin installed in C:\, not C:\cygwin. With those two changes, my cygwin and windows paths are identical (except for backslash or forward slash). Forgot about that -- I have had it in root for over 10 years... since I first installed cygwin.

FWIW -- I **might** call this a minor bug (or feature deficit). If you have native mode set for the symlinks, then I would argue that it shouldn't insert a cygwin/c in the path...

Windows won't let you create a symlink to a non-existent path (you can on unix or linux), so when it fails for any reason, cygwin falls back to storing the link in a file.
Re: [Bug-tar] Regarding untaring tar files containing symlinks
Amit Kapila wrote:

Please extract in order as mentioned to ensure that symlink folder gets extracted first. Please check contents of C:\Data\pg_tblspc, ideally it should contain symlink to C:\tbs.

---

It appears to work as you are wanting it to (console output below).

But I also note that if they are created in the wrong order (something that would work on linux), then I get the behavior you mention: instead of a symlink, I get a file containing text:

    C:\data\pg_tblspc>hexdump -C 16386
    0000 21 3c 73 79 6d 6c 69 6e 6b 3e ff fe 63 00 3a 00 |!<symlink>..c.:.|
    0010 5c 00 74 00 62 00 73 00 00 00 |\.t.b.s...|
    001a

Working case follows:

    /> mkdir tbs
    /> cd tbs
    /tbs> tar xvaf /tmp/tars/16386.tar
    PG_9.5_201406292/
    PG_9.5_201406292/12135/
    PG_9.5_201406292/12135/16387
    /tbs> ll -R
    .:
    total 0
    drwxrwxr-x+ 1 0 Jul 4 21:47 PG_9.5_201406292/

    ./PG_9.5_201406292:
    total 0
    drwxrwxr-x+ 1 0 Jul 4 21:47 12135/

    ./PG_9.5_201406292/12135:
    total 0
    -rw-rw-r-- 1 0 Jul 4 21:47 16387
    /tbs> cd ..
    /> mkdir data
    /> cd data
    /data> tar xaf /tmp/tars/base.tar
    /data> ll
    total 47
    -rw-rw-r--+ 1 4 Jul 4 00:01 PG_VERSION
    -rw---+ 1 206 Jul 4 21:47 backup_label
    drwxrwxr-x+ 1 0 Jul 4 00:02 base/
    drwxrwxr-x+ 1 0 Jul 5 00:18 global/
    drwxrwxr-x+ 1 0 Jul 4 00:01 pg_clog/
    drwxrwxr-x+ 1 0 Jul 4 00:01 pg_dynshmem/
    -rw-rw-r--+ 1 4352 Jul 4 00:03 pg_hba.conf
    -rw-rw-r--+ 1 1678 Jul 4 00:01 pg_ident.conf
    drwxrwxr-x+ 1 0 Jul 4 00:01 pg_llog/
    drwxrwxr-x+ 1 0 Jul 4 00:01 pg_multixact/
    drwxrwxr-x+ 1 0 Jul 4 21:45 pg_notify/
    drwxrwxr-x+ 1 0 Jul 4 00:01 pg_replslot/
    drwxrwxr-x+ 1 0 Jul 4 00:01 pg_serial/
    drwxrwxr-x+ 1 0 Jul 4 00:01 pg_snapshots/
    drwxrwxr-x+ 1 0 Jul 4 21:45 pg_stat/
    drwxrwxr-x+ 1 0 Jul 4 21:47 pg_stat_tmp/
    drwxrwxr-x+ 1 0 Jul 4 00:01 pg_subtrans/
    drwxrwxr-x+ 1 0 Jul 4 21:46 pg_tblspc/
    drwxrwxr-x+ 1 0 Jul 4 00:01 pg_twophase/
    drwxrwxr-x+ 1 0 Jul 4 00:17 pg_xlog/
    -rw-rw-r--+ 190 Jul 4 00:01 postgresql.auto.conf
    -rw-rw-r--+ 1 21994 Jul 4 00:05 postgresql.conf
    /data> ll pg_tblspc/
    total 0
    lrwxrwxrwx 1 4 Jul 4 21:46 16386 -> /tbs/
    /data> cd pg_tblspc/
    /data/pg_tblspc> ls
    16386@
    /data/pg_tblspc> cd 16386/
    /data/pg_tblspc/16386> ls
    PG_9.5_201406292/
    /data/pg_tblspc/16386> cd ..
    /data/pg_tblspc> cmd
    Microsoft Windows [Version 6.1.7601]
    Copyright (c) 2009 Microsoft Corporation. All rights reserved.

    C:\data\pg_tblspc>dir
     Volume in drive C is System Disk
     Volume Serial Number is E889-68E4

     Directory of C:\data\pg_tblspc

    07/04/2014 09:46 PM <DIR> .
    07/04/2014 09:46 PM <DIR> ..
    07/04/2014 09:46 PM <SYMLINKD> 16386 [C:\tbs]
    0 File(s) 0 bytes
    3 Dir(s) 218,077,290,496 bytes free

    C:\data\pg_tblspc>

-

This might have to do with the version of cygwin you have. You might make sure your cygwin is up to date, as I vaguely remember that somewhat older versions of 'cygwin' didn't create actual windows symlinks and hardlinks but created dummy files with the information in them (like you are seeing). But that changed... maybe 2-4 years ago?

If you are running on a 64-bit windows, there is also a 64-bit cygwin that runs in native mode and gives some advantages over 32-bit cygwin on a 64-bit machine (for example, 32-bit processes can't see the real C:\windows\system32, but are redirected to C:\windows\syswow64; syswow64 = windows[32] on windows64).

You can tell which version you are running by running uname -a:

    CYGWIN_NT-6.1 Athenae 1.7.30(0.272/5/3) 2014-05-23 10:36 x86_64 Cygwin

32-bit cygwin would say:

    CYGWIN_NT-6.1-WOW64 Athenae 1.7.28(0.271/5/3) 2014-02-09 21:06 i686 Cygwin
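The hexdump of the fallback file can be decoded by hand: a "!<symlink>" cookie, then what looks like a UTF-16 little-endian byte-order mark (ff fe), then the target path in UTF-16LE with a trailing NUL. A small Python sketch of that reading follows; the function name is made up, and the handling of a file without the BOM is only an assumption about older cygwin versions.

```python
COOKIE = b"!<symlink>"

def parse_cygwin_symlink(data):
    """Return the link target stored in a cygwin fallback symlink file,
    or None if the data does not start with the cookie."""
    if not data.startswith(COOKIE):
        return None
    body = data[len(COOKIE):]
    if body.startswith(b"\xff\xfe"):
        # BOM present: the rest is UTF-16LE, as in the hexdump above
        text = body[2:].decode("utf-16-le")
    else:
        # assumption: no BOM means a plain byte string
        text = body.decode("utf-8", errors="replace")
    return text.rstrip("\x00")

# The 26 bytes from the hexdump above
dump = bytes.fromhex(
    "21 3c 73 79 6d 6c 69 6e 6b 3e ff fe 63 00 3a 00"
    "5c 00 74 00 62 00 73 00 00 00"
)
print(parse_cygwin_symlink(dump))  # prints c:\tbs
```

So the file really does carry the intended target (c:\tbs); it just was not created as a native Windows symlink.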
Re: [Bug-tar] Regarding untaring tar files containing symlinks
Amit Kapila wrote:
On Mon, Jun 30, 2014 at 8:10 PM, Joerg Schilling joerg.schill...@fokus.fraunhofer.de wrote:
Amit Kapila amit.kapil...@gmail.com wrote:
On Mon, Jun 30, 2014 at 6:42 PM, Joerg Schilling joerg.schill...@fokus.fraunhofer.de wrote:

It will create copies as Win-DOS does not support symlinks.

Maybe not, but Windows has since Vista. Maybe you should give up Win-DOS?

For my usecase, I need it to maintain symlinks even after it gets untarred. I have noticed that WinRar is able to maintain symlinks after extraction.

I am not sure how this should work, given the fact that Win-DOS does not correctly support symlinks.

Internally the software uses junction points as mentioned in below link to create symlinks. http://www.codeproject.com/KB/winsdk/junctionpoints.aspx

Technically, it should use 'mklink', which creates windows symlinks. Junctions are similar but are used to do mounts of file systems. For most practical purposes, they should work 'fine'...

Cygwin walks all over Windows' mountpoints and refuses to honor them as mountpoints, so untarring things under cygwin will overwrite mount points... the mountd/linkd (types of junctions) are honored in one case in cygwin, but not in the other.

Cygwin will use the native windows symlinks, when requested (an ENV var), to create symlinks.

Given that Windows is windows, and it has symlinks, I don't see how you can claim windows symlinks don't work. (What is win-dos? Never heard of such -- I hope you don't believe windows = dos -- as that'd be really wrong.) If you want to claim they are not POSIX, that's a bit disingenuous, as Windows isn't POSIX.

Best bet might be to use cygwin to untar your files... it can create win-symlinks.

As for making multiple copies of files when untarring them? That's not the same as creating links. My font dir is 13.5 Meg if the links are ignored, while only ~4-5M if linked. (It actually uses hardlinks... which, BTW, windows also has.)
Re: [Bug-tar] adding ACLs when there are none
Pavel Raiskup wrote:

Or could you give an example? What *exactly* do you expect the --acls should behave by default? Combine existing acls in parent directory (default acls) with the stored in archive? Thanks, Pavel

-

Sorry, didn't finish that thought... If there are default acls set in the tar, they would replace any default acls that are present, but undefined ACLs in the tar wouldn't overwrite set acls that propagate from the parent.
Re: [Bug-tar] adding ACLs when there are none
Pavel Raiskup wrote:

When --acls option is on (regardless of tarball contents or tarball format), we should explicitly set OR delete default ACLs for extracted directories. Prior to this update, we always created arbitrary default ACLs based on standard file permissions.

Why would tar create any acls if there are none in the source tar? I saw someone else have a similar complaint about acls being created when the tar didn't have acls but the --acls option was used.

I wouldn't want a non-acl containing tarball to overwrite or change default acls in a directory that already exists. I wouldn't want acl=undef to overwrite a set value. I just was working on a parameter passing routine -- and had to think -- did I really want undef (in perl) to overwrite a defined value?

If I said --acl=reset or similar, that might be a desirable feature. But I use default acls and certainly wouldn't want them cleared as a normal action when there are no replacing acls specified.
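The rule I am arguing for can be written down in a few lines. This is only a sketch of the proposed semantics in Python, not anything tar does today; the function name and the "reset" flag (standing in for a possible --acl=reset) are hypothetical, and an ACL is represented as an opaque value with None meaning "undefined".

```python
def default_acl_after_extract(existing, archived, reset=False):
    """Decide which default ACL a directory ends up with after extraction.

    existing: default ACL already on the directory (None = none set)
    archived: default ACL stored in the tarball (None = none stored)
    reset:    hypothetical explicit request to clear ACLs
    """
    if archived is not None:
        return archived    # an ACL stored in the archive replaces what is there
    if reset:
        return None        # cleared only when explicitly asked for
    return existing        # undefined never overwrites a defined value
```

With this rule, extracting a tarball that carries no ACLs leaves default_acl_after_extract("u::rwx,g::r-x", None) at "u::rwx,g::r-x" instead of clearing it.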
Re: [Bug-tar] GNU tar generates malformed Pax attributes
Sorry for responding to this so far after the initial posting, but it caught my eye...

Joerg Schilling wrote:
Tim Kientzle kient...@acm.org wrote:

Quoting from "IEEE Std 1003.1, 2013 Edition" http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html

An extended header shall consist of one or more records, each constructed as follows:

    "%d %s=%s\n", <length>, <keyword>, <value>

The extended header records shall be encoded according to the ISO/IEC 10646-1:2000 standard UTF-8 encoding. The length field, blank, equals-sign, and newline shown shall be limited to the portable character set, as encoded in UTF-8.

It is not entirely clear, but the value is not listed in the list of items that need to be UTF-8 encoded. Even though it says the extended header records shall be encoded in UTF-8, it appears that the following line is designed to clarify which fields are to be encoded in UTF-8. As others point out, it would be difficult to presume that POSIX was dictating that the value fields (or the keyword fields) could only contain UTF-8 data.

I suppose I'll have to rework libarchive's pax parser to tolerate this. It would be nice if GNU tar could avoid such brokenness in the future.

This is definitely a bug in gtar and I hope that not many archives exist with this problem.

I'm not sure it is a bug, but I'm sure only a POSIX lawyer could really say one way or another...;-)

Given the fact that star includes portable ACL support since 2001 and Linux xattr support since 2003, I would guess that a typical user of these features uses star instead of gtar.

Not me. star dumps core on my system. Has for the past 4 releases of SUSE Linux (13.1, 12.3, 12.2, 12.1). Last worked ~ 11.3 or so. Not only did it dump core, but it proceeded to finish in the main thread and return a success. Only the backtrace listing gave me a heads-up (appended to this message).

BTW: is this feature in use with gtar? I recently tried to compile gtar on Solaris and it does not seem to support ACLs even though the Changelog claims such support. Given the fact that star includes portable ACL support since 2001 and Linux xattr support since 2003, I would guess that a typical user of these features uses star instead of gtar.

Regarding the problem: It should be obvious that it is an implementation detail that Linux internally stores e.g. ACLs as system xattrs and that related data must not appear in an archive. Star of course excludes such data from the archive.

We are talking about dumping Xattrs, not ACLs. It would be a bug to convert the file system Xattrs into some ACL-interpretation. That interpretation depends on the OS, and is not part of the POSIX standard. I.e. Linux should store exactly the Xattrs that are there -- not some pseudo ACL interpretation -- as that may be wrong. If star is not storing the Xattrs as listed, I don't see how that would be correct.

For the same reason, it is most likely wrong to archive the extended attribute files SUNWattr_ro and SUNWattr_rw that appear e.g. in ZFS even though I expect the content to be portable across OpenSolaris, FreeBSD and Linux. Note: the content of these files is created from libnvpair and thus could be called documented.

---

Xattrs are NOT portable between OS's. Sun's ACL's are not portable to other OS's -- not Linux, and not Windows. To try to convert them to some portable format would be very bad, since they don't map 1:1 and anything that isn't exactly the same would be a security flaw.

The next task for star is to implement NFSv4 ACL support for FreeBSD. From what I found in the net, there is currently no compatible NFSv4 ACL support on Linux, so it currently seems to be a feature that only can be supported on OpenSolaris and FreeBSD.

Because SunOS didn't follow the only standards [sic] that were in existence at the time (the withdrawn POSIX standard). NOTE: it was withdrawn due to a lack of quorum. They needed some number of signatures (say 100 or 50) to ratify it as a standard. By the time the document was finished, most of the members that had started out had dropped out (unix companies were going under around that time).

Does anyone know whether there is another OS that implements the withdrawn POSIX draft for xattrs?

IRIX from sgi. In particular, from http://techpubs.sgi.com/library/manuals/4000/007-4273-003/sgi_html/apa.html (emphasis, below, mine):

The interface with the kernel (system calls) for Access Control Lists (ACLs) and Extended Attributes (EAs) is different in XFS/Linux compared with IRIX. *** However, the user level libraries for both ACLs and EAs are exactly the same in Linux and IRIX. Thus ACL or EA application code can be exactly the same.

Uninteresting core dump of star-1.5final-61.1.2.x86_64 from SuSE 13.1 follows:

    *** buffer overflow detected ***: star terminated
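For anyone who wants to poke at the record format quoted from the pax spec above, here is a small Python sketch that builds one record. The length field counts the whole record, including its own digits, so it has to be found iteratively. The function name is made up, and passing raw (non-UTF-8) bytes as the value is exactly the disputed case in this thread, not something the sketch claims is conforming.

```python
def pax_record(keyword, value):
    """Build one pax extended header record: "%d %s=%s\\n".

    `value` may be str (encoded as UTF-8) or raw bytes (stored as-is).
    """
    if isinstance(value, str):
        value = value.encode("utf-8")
    payload = b" " + keyword.encode("utf-8") + b"=" + value + b"\n"
    size = len(payload)
    total = size
    while True:
        # total length = payload + number of digits needed to print total
        candidate = size + len(str(total))
        if candidate == total:
            break
        total = candidate
    return str(total).encode("ascii") + payload

print(pax_record("path", "foo"))  # prints b'12 path=foo\n'
```

A binary xattr value such as b"\xff\x00" goes through the same length calculation unchanged, which is why a parser that insists on valid UTF-8 in the value field trips over such records.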
Re: [Bug-tar] tar doc question: what is hard-dereference; how is it done?
I usually look in the manpage. There's not even a link there to the URL, below, that you mention, so how would people know to look there? I think the manpage is a better place for information like that, though. Thank you for the link -- I'd never seen that feature or switch before.

Instead of going this route with hard-links, maybe tar might pre-process the directory tree into its own memory, or as a .dir-some-dt-stamp at the root of the tree just for things like hardlinks;

or) store the inode of multi-linked files, which would act like a symbolic-identifier within the tar-archive, requiring multi-linked files to be extracted with any partial-extract;

or) building on the previous... create full copies of hard-linked files with the inode#-as-sym that can be replaced w/hardlinks on extraction. That way either copy could be extracted separately, and only in the presence of both would they be hardlinked.

I don't usually use the partial extract option, which may be why I've never noticed this behavior... Thanks again for the ptr.

On 12/6/2013 4:44 PM, Paul Eggert wrote:
Linda A. Walsh wrote:

What is this switch supposed to do?

To answer questions like these, I suggest looking at the tar documentation. Here's a URL: http://www.gnu.org/software/tar/manual/html_node/hard-links.html
Re: [Bug-tar] tar doc question: what is hard-dereference; how is it done?
On 12/7/2013 9:00 AM, Sergey Poznyakoff wrote:

Perhaps it might, but this would mean inventing a new archive format or extending one of the supported ones. Both ways would produce compatibility problems with prior releases of the GNU tar as well as with other tar implementations. I can't see any gain which would justify our going into such troubles.

The gain was stated under option 3 in the 2nd note: either copy could be extracted separately, and only in the presence of both would they be hardlinked. Meaning any linked copy of a file would retrieve the file, the same as on hard disk, and only hardlinks between files that are actually extracted would be restored. That way you don't have the problem that the current design implies.

That is, if you use the follow-hardlinks option, you won't get hard linked files on the destination. There's no way, if I understand, to both restore hard links and support partial extractions with the current method. I listed ways around that.

To support both partial extractions and restoring hard links without problems, and to have a tar that transparently just works, no matter which way it is used, would be the motivation. Or you can document the shortcomings due to the inaccurate model of the file system gnu-tar uses -- which is the route currently being taken.

Other than something that works without exceptions, I can see no reason to justify such a change...
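To make the trade-off concrete, here is a sketch using Python's tarfile module (not GNU tar itself; the helper name is made up). By default the second name of a hard-linked file is stored as a link entry with no data of its own, which is why extracting only that member cannot produce the file. With dereference=True both names get a full copy, which, as far as I can tell, is roughly the effect of the switch under discussion, but the link relationship is then gone.

```python
import io
import os
import tarfile
import tempfile

def archive_members(dereference):
    """Archive two hard-linked names; return (name, kind, size) per member."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")
        b = os.path.join(d, "b.txt")
        with open(a, "wb") as f:
            f.write(b"payload")
        os.link(a, b)                       # b.txt is a second name for a.txt
        buf = io.BytesIO()
        with tarfile.open(fileobj=buf, mode="w",
                          dereference=dereference) as tar:
            tar.add(a, arcname="a.txt")
            tar.add(b, arcname="b.txt")
        buf.seek(0)
        with tarfile.open(fileobj=buf, mode="r") as tar:
            return [(m.name, "hardlink" if m.islnk() else "file", m.size)
                    for m in tar.getmembers()]

print(archive_members(False))  # b.txt is a link entry of size 0
print(archive_members(True))   # both entries carry the 7 data bytes
```

Neither mode gives what option 3 asks for: full copies in the archive plus enough information to re-link them when both happen to be extracted.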
[Bug-tar] tar has PROGRAM ERROR:
On openSuse, w/tar-1.26-17.1.x86_64, I tried using the --portability flag to get only older headers. Instead I got:

    tar cf out --portability P-1.1.4/
    tar: --old-archive: (PROGRAM ERROR) Option should have been recognized!?
    Try `tar --help' or `tar --usage' for more information.

tar bug or platform (opensuse) bug?
[Bug-tar] tar and file meta data....
Saw this comment in the precompression thread:

[Tim] zip does exactly what you describe, and newer Info-Zip versions do a good job of preserving Unix permissions, timestamps, etc.

Instantly, I think: but does it handle alternate data forks as exist on Windows, Linux and Apple?

Windows NTFS has 'extended data' or streams, ACLs, and sensitivity/integrity labels. Linux has had them since the early 90's when XFS came with them. They've been added to other file systems since then. And Apple's file system also supports alternate data/resource forks.

zip finally caught up to doing some level of permissions (don't know to what level)...

The core utils like 'cp' copy ACL's and extended attributes by default when you use the -a / archive switch. rsync is a bit behind the times: its '-a' archive mode only backs up partial data. You have to add -HAX to the cmd line to get hard links, ACLs and extended attributes. It can even store incompat ACL's and such in a generic form for later restoration!

star (a multi tar format tar prog from the 90's) has had support for ACLs/extended attrs -- everything about a file -- for at least 10 years on systems that supported them. (Unfortunately, it coredumps on the new version of suse (11.4) I upgraded[sic] my home server to from 11.2 -- which had been stable...)

So... plans for tar? When will it be usable for archiving again? I mean, it's good for content transport, but permissions -- not so good... since most of win's permissions are in the ACL's.

Is it, *gulp*, already done, and I just need to update to the latest release? suse's 11.4 came w/1.26.
[Bug-tar] .lz also can be extension for .lzma files
Reading the .lz comments reminded me -- currently, if one does a

    tar caf sam.tlz Dirname

it creates a lzma compressed tar archive. I hope this won't be broken by adding .lzip support.

But I wanted to mention -- I don't know if it is fixed in the newer versions, but after doing a tar as quoted above, an attempt to extract the archive with the same method doesn't work:

    # tar xaf ../sam.tlz
    tar: This does not look like a tar archive
    tar: Skipping to next header
    tar: Error exit delayed from previous errors

Notably, the Gnu file command doesn't recognize lzma archives as archives -- it just thinks they are 'data' (file V 4.24). Could this be related?

Note - if you specify the --lzma switch, it extracts fine, so nothing is wrong w/the archive; tar just doesn't 'auto-recognize' it.

As for successors of lzma, I may be confused, but hasn't the original author released his own successor? 7z? It's got the highest compression ratio, and stores files in random-access archives like zip archives. The linux version is parallelized and automatically will start N parallel compression threads when storing a directory -- it even runs multiple threads on single large files, as the format lends itself to trying different parameters to achieve optimal results. It's impressive, though the command line syntax is more compatible with zip/unzip than gzip/bzip.

Though it's usually a score or more bytes smaller than the 'competitors', I don't usually use it unless I'm compressing large files, simply due to it not working with pipes and its syntax.

-l
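One possible reason 'file' only says 'data': the legacy .lzma container has no dedicated magic signature at the start, unlike xz (6 fixed bytes) or lzip ("LZIP"). The Python sketch below shows the difference using the standard lzma module. The sniff function is made up for illustration, and whether this is what actually makes tar's auto-detection fail on extraction is only a guess.

```python
import lzma

XZ_MAGIC = b"\xfd7zXZ\x00"   # fixed 6-byte signature at the start of .xz files
LZIP_MAGIC = b"LZIP"         # signature at the start of lzip files

def sniff(data):
    """Guess the container from leading bytes; 'unknown' if no signature."""
    if data.startswith(XZ_MAGIC):
        return "xz"
    if data.startswith(LZIP_MAGIC):
        return "lzip"
    return "unknown"

sample = b"some tar data " * 100
as_xz = lzma.compress(sample, format=lzma.FORMAT_XZ)
as_lzma = lzma.compress(sample, format=lzma.FORMAT_ALONE)  # legacy .lzma

print(sniff(as_xz))    # xz
print(sniff(as_lzma))  # unknown: the stream starts with parameters, not a signature
```

Any tool that wants to recognize legacy .lzma data has to fall back on heuristics about those leading parameter bytes, which is less reliable than matching a signature.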
Re: [Bug-tar] --remove-files deletes files even if archive couldn't be created, when used with compression
Sergey Poznyakoff wrote:
Bart Botta 000...@gmail.com wrote:

    $ ../src/tar cfv a --remove-files b
    ../src/tar: a: Cannot open: Is a directory

Argument to the -f option cannot be a directory. See...

That doesn't address the problem.

I feel that tar should be more robust in handling 'source' files that are going into an archive, and should NOT delete the source files if it can't open the target archive (and move them into it).

Similarly, I wouldn't expect 'mv' to delete a file from a source dir unless it could successfully move it to a target directory. No?
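The ordering I would expect is simple to state in code. This is a sketch in Python with the tarfile module, not a patch for GNU tar, and the function name is made up: the sources are removed only after the archive has been opened, written and closed without error.

```python
import os
import tarfile

def archive_then_remove(archive_path, sources):
    """Create an uncompressed tar of `sources` (regular files), then
    delete them. Returns False and deletes nothing if archiving fails."""
    try:
        with tarfile.open(archive_path, "w") as tar:
            for src in sources:
                tar.add(src)
    except (OSError, tarfile.TarError):
        return False          # e.g. archive_path is a directory
    for src in sources:
        os.remove(src)        # reached only after the archive is closed
    return True
```

With the failing case from the report (the -f argument is a directory), this returns False and the source file is still there afterwards.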
Re: [Bug-tar] [Fwd: Bug#543913: tar tf uses read() instead of lseek(), making it slow]
Tim Kientzle wrote:
Lars Stoltenow wrote:
On Thu, Aug 27, 2009 at 06:29:04PM +0300, Sergey Poznyakoff wrote:

The --seek (-n) command line option instructs tar to use lseeks instead of reads. Use it.

Then probably GNU tar should detect automatically if a file is seekable or not.

This was discussed about two years ago: http://www.mail-archive.com/bug-tar@gnu.org/msg01602.html

Yes, it was, and the final resolution, ...

Subject: Re: [Bug-tar] GNU tar, star and BSD tar speed comparision
On Tue, 2007-Oct-23, 03:32, Sergey Poznyakoff wrote:
Tim Kientzle wrote:
Sergey Poznyakoff Tue, 23 Oct 2007 02:31:28 -0700 wrote:

When reading uncompressed tar archives stored in regular files, bsdtar uses lseek() operations to skip over the bodies of files.

As a side note, the similar feature in GNU tar is enabled using seek option.

Is there a reason GNU tar doesn't enable this by default for all regular files?

I forgot to implement it :) But I'll fix that soon.

Regards, Sergey

Not sure of exact circumstances, but should the user have encountered the problem if he was using a fixed version?
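For readers who have not looked at the format: listing an uncompressed archive only needs the 512-byte headers, so on a seekable file the member bodies can be skipped with a seek instead of being read. A minimal Python sketch of the idea follows. It assumes plain ustar headers with short names and octal sizes, ignores pax/GNU extension headers and the prefix field, and is not how GNU tar or bsdtar implement it.

```python
import io
import os
import tarfile

BLOCK = 512

def list_names_by_seeking(f):
    """List member names of an uncompressed tar, seeking past the bodies."""
    names = []
    while True:
        header = f.read(BLOCK)
        if len(header) < BLOCK or header == b"\0" * BLOCK:
            break                                  # end-of-archive marker
        name = header[:100].split(b"\0", 1)[0].decode("utf-8", "replace")
        size_field = header[124:136].strip(b"\0 ").decode("ascii")
        size = int(size_field, 8) if size_field else 0
        names.append(name)
        # bodies are padded to a multiple of 512 bytes; skip without reading
        f.seek((size + BLOCK - 1) // BLOCK * BLOCK, os.SEEK_CUR)
    return names

# Build a small ustar archive in memory to try it on
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    for member, data in (("big.bin", b"x" * 1000), ("small.txt", b"hello")):
        info = tarfile.TarInfo(member)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
buf.seek(0)
print(list_names_by_seeking(buf))  # ['big.bin', 'small.txt']
```

On a pipe or a tape the seek is not available, which is why detecting whether the input is seekable matters.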
[Bug-tar] RFE? tar doesn't copy file attributes or ACL's (full permissions)
Had a 'feature deficit' that I thought of for tar...;^)

Would it be possible to get tar to copy (or ignore) file attributes, and to copy ACL's if they exist (maybe via an option)?

Some attrs I really like to keep on files, and I end up having to use a host of different utils (depending on fs and os). A common example is the '+d' 'don't dump' attribute -- something I usually set on large multi-gig scratch files I'm using for testing, or temporary copies of DVD's on my disk (certainly don't need to back up such things, as the original is the DVD -- much better than trying to use backup space).

ACL's -- those would be most useful for me under Cygwin-Windows. Cygwin uses Windows ACL's itself for emulating the standard unix rwx permissions for groups, but native ACL's are also set for existing files, and it would be nice if they could be kept in a backup. (Windows' native backup seems to die about 165G into its backup attempt; I think it is actually 'finished', and dies during final 'housekeeping'.) I'd rather save it in a standard format if I could (i.e. tar)... Of course on Win, it has underlying file attribs as well (HSRA...).

Dunno how possible... but certainly the ext2+xfs 'd' attribs and file ACL's -- it'd be up to the support libraries to actually be available to be able to pull the values in so 'tar' could store them.

_Maybe_ for backwards compat such info could be stored ahead of a file in some variation of the filename? Like if fn=filename, then store attribs in .filename#.%attrib%.hex-numbered ext, if need to avoid fn collisions?

Just a thought...?

Linda