sysutils/freesbie: 400,000x optimization & 2x optimization
2006-05-06 08:57:00

FreeSBIE 2 is a well-written program, as I've mentioned before. I never used version 1, but version 2 was advertised for its modularity. One of the reasons I think the design is good is that it takes full advantage of the FreeBSD base build system.

However, there is still room for improvement. In img.sh and clonefs.sh, dd is used to create the file that backs an md device, and the original scripts write zeros for the entire file. Because the only goal is a large zero-filled file, dd's seek option can skip almost all of those writes: dd seeks to the last block and writes only that one, and UFS creates a sparse file.


diff -urN freesbie2.org/scripts/clonefs.sh freesbie2/scripts/clonefs.sh
--- freesbie2.org/scripts/clonefs.sh    Tue Jan 10 18:30:07 2006
+++ freesbie2/scripts/clonefs.sh        Thu May  4 02:28:41 2006
@@ -35,7 +35,7 @@
     # Find the total dir size and initialize the vnode
     DIRSIZE=$(($(du -kd 0 | cut -f 1)))
     FSSIZE=$(($DIRSIZE + ($DIRSIZE/5)))
-    dd if=/dev/zero of=${UFSFILE} bs=1k count=${FSSIZE} >> ${LOGFILE} 2>&1
+    dd if=/dev/zero of=${UFSFILE} bs=1k count=1 seek=$((${FSSIZE} - 1)) >> ${LOGFILE} 2>&1

     DEVICE=/dev/$(mdconfig -a -t vnode -f ${UFSFILE})
     newfs -o space ${DEVICE} >> ${LOGFILE} 2>&1
diff -urN freesbie2.org/scripts/img.sh freesbie2/scripts/img.sh
--- freesbie2.org/scripts/img.sh        Wed Nov 16 19:21:42 2005
+++ freesbie2/scripts/img.sh    Thu May  4 02:31:41 2006
@@ -33,7 +33,7 @@
 SECTS=$((${CYLINDERS} * ${CYLSIZE}))

 echo "Initializing image..."
-dd if=/dev/zero of=${IMGPATH} count=${SECTS} >> ${LOGFILE} 2>&1
+dd if=/dev/zero of=${IMGPATH} count=1 seek=$((${SECTS} - 1)) >> ${LOGFILE} 2>&1

 # Attach the md device
 DEVICE=`mdconfig -a -t vnode -f ${IMGPATH} -x ${SECTT} -y ${HEADS}`

In img.sh, dd creates a file of about 200 MB, and the default block size (bs) is 512 bytes, so the original command writes about 400,000 blocks where the patched one writes a single block. Creating the file becomes about 400,000 times faster! The size of the file that clonefs.sh creates depends on the size of the system.
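The arithmetic behind that figure can be checked with a small sketch. IMG_BYTES is an assumed round size of 200 MB (img.sh derives the real size from its disk geometry), and the scratch directory from mktemp is only for illustration. The script counts the blocks the original dd would write, then creates the file the patched way and reports its logical size next to the space actually allocated.

```shell
#!/bin/sh
# Sketch: how many blocks each dd invocation writes for a ~200 MB image.
# IMG_BYTES is an assumed example size, not a value taken from img.sh.
IMG_BYTES=$((200 * 1024 * 1024))
BS=512                                    # dd's default block size
SECTS=$((IMG_BYTES / BS))                 # blocks written by the original dd
echo "original dd writes ${SECTS} blocks, patched dd writes 1"

# Create the file the patched way, in a scratch directory.
WORKDIR=$(mktemp -d)
IMG="${WORKDIR}/sparse.img"
dd if=/dev/zero of="${IMG}" count=1 seek=$((SECTS - 1)) 2>/dev/null

APPARENT=$(wc -c < "${IMG}" | tr -d ' ')   # logical size in bytes
ALLOCATED_KB=$(du -k "${IMG}" | cut -f 1)  # space actually allocated, in KB
echo "apparent size: ${APPARENT} bytes, allocated: ${ALLOCATED_KB} KB"
rm -rf "${WORKDIR}"
```

On a filesystem that supports sparse files, the allocated figure should be far smaller than the apparent size.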

Although I didn't add it to this patch, it is a better idea to run "rm -rf ${filename}" before dd. Then dd will neither try to write over an existing directory nor reuse an existing file. If an existing file is smaller than the size dd is writing, there is no significant problem, because newfs will wipe out everything and nothing old will be visible through the md mount. On the other hand, if a larger file exists, for example a 1 GB file, dd will effectively be a no-op and FreeSBIE will use the 1 GB file to create the disk image. That is not what we want.
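A minimal sketch of that idea follows. create_sparse is a hypothetical helper, not a function that exists in FreeSBIE, and the file names and sizes are made up for the demonstration. It removes whatever is at the target path first, so the resulting size never depends on a leftover from an earlier run.

```shell
#!/bin/sh
# Sketch: remove any stale target before creating the sparse backing file.
# create_sparse is a hypothetical helper, not part of FreeSBIE.
create_sparse() {
    target=$1               # path of the backing file
    blocks=$2               # desired size in 512-byte blocks
    rm -rf "${target}"      # drop a leftover file or directory
    dd if=/dev/zero of="${target}" count=1 seek=$((blocks - 1)) 2>/dev/null
}

WORKDIR=$(mktemp -d)
IMG="${WORKDIR}/test.img"

# Simulate a larger leftover image from an earlier run (4 MB).
dd if=/dev/zero of="${IMG}" bs=1k count=4096 2>/dev/null

create_sparse "${IMG}" 2048                # 2048 * 512 bytes = 1 MB
SIZE=$(wc -c < "${IMG}" | tr -d ' ')
echo "size after create_sparse: ${SIZE} bytes"
rm -rf "${WORKDIR}"
```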

In addition, img.sh runs another dd at the end to create the disk image. This operation is not really necessary.


diff -urN freesbie2.org/scripts/img.sh freesbie2/scripts/img.sh
--- freesbie2.org/scripts/img.sh        Wed Nov 16 19:21:42 2005
+++ freesbie2/scripts/img.sh    Thu May  4 02:33:47 2006
@@ -37,7 +37,6 @@

 # Attach the md device
 DEVICE=`mdconfig -a -t vnode -f ${IMGPATH} -x ${SECTT} -y ${HEADS}`
-rm -f ${IMGPATH}

 echo "g c${CYLINDERS} h${HEADS} s${SECTT}" > ${TMPFILE}
 echo "p 1 165 ${SECTT} $((${SECTS} - ${SECTT}))" >> ${TMPFILE}
@@ -58,10 +57,6 @@
 echo "/dev/ufs/${FREESBIE_LABEL} / ufs ro 1 1" > ${TMPDIR}/etc/fstab
 umount ${TMPDIR}
 cd ${LOCALDIR}
-
-echo "Dumping image to ${IMGPATH}..."
-
-dd if=/dev/${DEVICE} of=${IMGPATH} bs=64k >> ${LOGFILE} 2>&1

 mdconfig -d -u ${DEVICE}

The ${IMGPATH} file that dd generated, and that is attached to the md device, IS THE DISK IMAGE ITSELF. There is no need to copy it back from the md device. For the same reason, the patch drops the "rm -f ${IMGPATH}" that followed mdconfig, since the file must now be kept. img.sh has two time-consuming steps: one is extracting files onto the md-mounted filesystem, and the other is copying from the md device to a file. The no-dump patch removes the latter; as a result, img.sh takes about half the time it took before.

Indeed, this 2x optimization is quite significant. It removes a major part of the processing time, so its effect is actually greater than that of the dd-seek patch. Although dd-seek makes its step 400,000 times faster, the original program does not spend much time there; on today's disks, writing from /dev/zero to a file is quite fast. The no-dump patch saves more time because the step it removes is a copy from one disk onto the same disk.

It is still worth optimizing the spots where little time is spent. It is more important to improve the spots where most of the time is spent. Even if the speedup ratio is smaller, the overall gain will be greater.
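To make the comparison concrete, here is a small arithmetic sketch. The three timings are invented for illustration and were not measured; they are only chosen so that the two slow steps dominate, as described above.

```shell
#!/bin/sh
# Sketch with made-up timings in seconds; none of these were measured.
FILL=4        # zero-filling the image with dd
EXTRACT=60    # extracting files onto the md-mounted filesystem
DUMP=60       # copying the md device back to a file

BASE=$((FILL + EXTRACT + DUMP))
SEEK_ONLY=$((EXTRACT + DUMP))      # dd-seek: fill time drops to roughly zero
NODUMP_ONLY=$((FILL + EXTRACT))    # no-dump: the copy-back step disappears

echo "original:     ${BASE}s"
echo "dd-seek only: ${SEEK_ONLY}s (saves $((BASE - SEEK_ONLY))s)"
echo "no-dump only: ${NODUMP_ONLY}s (saves $((BASE - NODUMP_ONLY))s)"
```

With these assumed numbers, the 400,000x speedup saves a few seconds while the 2x change saves about half of the run.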

Patches: dd-seek patch and no-dump patch.
