tgies | information theory is hard let's go shopping

that does not make sense!

ok let's look at audio using binary as everything in computing is based on binary.

let's for a moment assume the following binary is a wave file.

original wave = 101001000011110100101010010011010101001010

i understand that it's way too small to be a wave file but let's just say it is for a moment!

after compression let's assume the following is the mp3 binary

compressed mp3 = 1010010

this should then leave you with the following leftover wave binary

leftover or discarded wave information = 00011110100101010010011010101001010

why is it not possible to add the mp3 binary to the discarded wave to recreate the original file again? like so

1010010 + 00011110100101010010011010101001010


From:

999kcelfe.livejournal.com

GLORIOUS FM RADIO SOUND!

I'll just keep FLACS, and transcode
to MP3 when I want to listen to
the file on my cheap MP3 player.
(the one I just ordered)
Why bother with a difference file?

From:

tgies.livejournal.com

Especially since the way FLAC works already basically implements that "difference file" concept. Only it was actually designed to, so it actually works, whereas trying to get MP3 to do something like that is pretty much fraught with peril.

From:

999kcelfe.livejournal.com

difference file in FLAC?

I always thought it was just lossless compression.
(and for lossless compression, it works pretty well,
anywhere from about 25-75% of the original size.)

From:

tgies.livejournal.com

It's not really a separate file. Here's how it works.

FLAC chops the audio up into blocks, and then finds a mathematical function which closely approximates the content of each block. To oversimplify a bit for the sake of example, say I had a sine wave -- instead of storing each individual point in the sine wave, for a total of thousands of bytes, FLAC will just store a few bytes meaning "sin(x)" or whatever.

However, you're not going to find a simple function which exactly represents the original waveform, so FLAC finds the difference (called the "residual") between its approximation and the actual data, and stores that as well, so when you decompress your FLAC you get the exact original signal back. That's the difference data that I'm talking about.
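
To make the approximation-plus-residual idea concrete, here's a minimal Python sketch. It is not FLAC's real encoder or file format: the fixed second-order predictor (guess each sample by extending the line through the previous two) and the names `encode_block` and `decode_block` are assumptions picked for illustration.

```python
import math

def encode_block(samples):
    """Split a block into two warm-up samples plus prediction residuals.

    Assumed predictor: s[n] is guessed as 2*s[n-1] - s[n-2].
    """
    warmup = list(samples[:2])
    residual = [
        samples[n] - (2 * samples[n - 1] - samples[n - 2])
        for n in range(2, len(samples))
    ]
    return warmup, residual

def decode_block(warmup, residual):
    """Rebuild the exact samples by re-running the predictor and adding the residual."""
    out = list(warmup)
    for r in residual:
        out.append(2 * out[-1] - out[-2] + r)
    return out

# A smooth test signal: a sine wave scaled to roughly 14-bit integer samples.
samples = [round(10000 * math.sin(2 * math.pi * n / 100)) for n in range(200)]
warmup, residual = encode_block(samples)
restored = decode_block(warmup, residual)

print(restored == samples)              # exact round trip
print(max(abs(s) for s in samples))     # size of the original values
print(max(abs(r) for r in residual))    # residuals are much smaller
```

The samples swing across thousands of steps, but the residuals stay within a few dozen, so they need far fewer bits each. All integer arithmetic, so the decode is bit-exact.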

There's technical documentation on the FLAC website and probably on Wikipedia explaining this in more detail, if you're interested.

From:

tgies.livejournal.com

http://flac.sourceforge.net/format.html#blocking

This part of the page and the next few paragraphs provide an overview of how the codec works. Basically:

Step 1: Chop up the audio stream into blocks that are easier to handle and transport.
Step 2: Find anything which is duplicated on both the left and right channels (in the "center channel") and discard the duplicated information. This is typically done by converting the audio from a left-right channel representation into a representation of the "center" channel and the difference between the left and right channels.
Step 3: For each block, pick an algorithm which closely approximates the content of the block. This information can then be stored as just a few bytes, representing the identity of the algorithm chosen and the coefficients to plug into it.
Step 4: Find the difference between the real signal and the approximation made in step 3 and store this in a simpler compressed form (FLAC uses Rice coding for this), which is loosely similar in spirit to the entropy-coding stage used in, say, a zip file.
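
Steps 2 and 4 can be sketched in a few lines of Python. This is a toy under stated assumptions, not FLAC's bitstream: the function names are made up, the "bitstream" is a string of '0'/'1' characters, and the Rice parameter `k` is picked by hand rather than chosen per block.

```python
def to_mid_side(left, right):
    """Step 2 sketch: store (L+R)>>1 and L-R instead of L and R."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side

def from_mid_side(mid, side):
    """Undo to_mid_side exactly; the bit lost by >>1 is the parity of side."""
    left, right = [], []
    for m, s in zip(mid, side):
        total = (m << 1) | (s & 1)      # recover L+R
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right

def zigzag(r):
    """Map signed to unsigned: 0,-1,1,-2,2,... -> 0,1,2,3,4,..."""
    return 2 * r if r >= 0 else -2 * r - 1

def unzigzag(u):
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)

def rice_encode(values, k):
    """Step 4 sketch: unary quotient, then k-bit remainder, per value."""
    bits = []
    for r in values:
        u = zigzag(r)
        bits.append("1" * (u >> k) + "0")
        if k:
            bits.append(format(u & ((1 << k) - 1), "0{}b".format(k)))
    return "".join(bits)

def rice_decode(bits, k, count):
    out, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1                        # skip the terminating '0'
        rem = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        out.append(unzigzag((q << k) | rem))
    return out

left = [100, 102, 105, 103, -7]
right = [98, 101, 105, 100, -8]
mid, side = to_mid_side(left, right)

residual = [0, -1, 3, 2, -4, 1, 0, -2]  # small values, as a good predictor leaves
bits = rice_encode(residual, 2)
print(from_mid_side(mid, side) == (left, right))
print(rice_decode(bits, 2, len(residual)) == residual)
print(len(bits), "bits instead of", 16 * len(residual))
```

Rice coding only pays off because the residuals are small; feed it large, noisy values and the unary part balloons.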

Edited Date: 2008-05-04 05:07 am (UTC)

From:

korgmeister.livejournal.com

WTF part of "MP3 is a proprietary algorithm and they don't want to make it any easier to reverse-engineer than it already is" does this guy not grok?

(deleted comment)

From:

tgies.livejournal.com

I remember the Sega Genesis stored samples as 4-bit PCM

Yeah, one nybble per sample

That was really bad

They used this sort of half-assed dithering thing to try and upconvert to 8-bit for output, but it wound up sounding worse than if they had just padded the samples out with zeroes, because the random number generator they were using for the dithering had periodic components

hope you like random ringing sounds

(deleted comment)

From:

tgies.livejournal.com

Not really

dithering only particularly helps when downconverting
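
A rough Python sketch of why dither helps when going down in resolution. Everything here is an assumption for illustration: one "step" stands for one LSB of the lower-resolution format, the input is a constant level of 0.3 steps, and the dither is triangular (TPDF) noise from a seeded generator.

```python
import math
import random

def quantize(x):
    """Round to the nearest whole step of the lower-resolution format."""
    return math.floor(x + 0.5)

def quantize_with_dither(x, rng):
    """Add triangular dither spanning (-1, 1) steps before rounding."""
    dither = rng.random() - rng.random()
    return math.floor(x + dither + 0.5)

rng = random.Random(0)      # seeded so the run is repeatable
level = 0.3                 # a detail smaller than one output step
count = 20000

plain = [quantize(level) for _ in range(count)]
dithered = [quantize_with_dither(level, rng) for _ in range(count)]

plain_mean = sum(plain) / count
dithered_mean = sum(dithered) / count
print(plain_mean)       # the sub-step detail is simply gone
print(dithered_mean)    # the detail survives on average, as noise
```

Without dither every output is 0 and the 0.3 vanishes; with dither the outputs jump between neighbouring steps but average out near 0.3. Going the other way there is no detail being rounded off, so there is nothing for dither to rescue.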

From:

tgies.livejournal.com

MP3 isn't too proprietary, considering that nobody is even sure who owns it, and in any case the organizations who lay claim to it have said that free software encoders are OK. The main problem here is that Hydrogen Audio is full of people with no concept of information or audio or anything having angry arguments about CDs and shit all day long, and it's kind of amusing. I can't really complain about it, since ostensibly they're all slowly learning a thing or two, but it's still pretty funny sometimes when they come up with hare-brained ideas in total ignorance of the way anything works. I think the Nyquist-Shannon sampling theorem gets "disproven" an average of once a week on there now.

From:

darnn.livejournal.com

Arr!

From:

r-transpose-p.livejournal.com

Um, I'm not entirely sure what you're talking about, but the post you linked to made sense to me. I didn't read the discussion following it.

Let's say mp3 converts the time series of the original wave file into the coefficients of some basis, and then takes the most significant basis elements, and writes out their coefficients in some compressed manner. All the user appears to be asking for is a second file which stores the coefficients for the dropped (i.e. insignificant) basis elements in a compressed manner. And, of course, this would only work if there were a program to take two compressed files of basis element coefficients, and combine them into one file of basis element coefficients (possibly converting the signal back into the time series domain in the process). Whether or not the user is under the delusion that concatenating these compressed basis element coefficient files would give back the original time series wave data is another question altogether, but one can easily imagine a tool to do what the user, presumably, wants.
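
Here's a small Python sketch of that imagined tool, under loud assumptions: it uses a plain orthonormal DCT as the basis (not MP3's actual filterbank, quantization or psychoacoustics), keeps coefficients as raw floats with no compression at all, and the names `split` and `combine` are hypothetical.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a list of floats."""
    n_pts = len(x)
    return [
        (math.sqrt(1 / n_pts) if k == 0 else math.sqrt(2 / n_pts))
        * sum(x[n] * math.cos(math.pi * (n + 0.5) * k / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]

def idct(coeffs):
    """Inverse of dct() above."""
    n_pts = len(coeffs)
    return [
        sum(
            (math.sqrt(1 / n_pts) if k == 0 else math.sqrt(2 / n_pts))
            * coeffs[k] * math.cos(math.pi * (n + 0.5) * k / n_pts)
            for k in range(n_pts)
        )
        for n in range(n_pts)
    ]

def split(coeffs, keep):
    """Put the `keep` largest coefficients in one 'file', the rest in another."""
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    kept = set(order[:keep])
    main = [c if k in kept else 0.0 for k, c in enumerate(coeffs)]
    dropped = [0.0 if k in kept else c for k, c in enumerate(coeffs)]
    return main, dropped

def combine(main, dropped):
    """Merge both coefficient 'files' and go back to the time domain."""
    return idct([a + b for a, b in zip(main, dropped)])

signal = [
    math.sin(2 * math.pi * 3 * n / 32) + 0.1 * math.sin(2 * math.pi * 11 * n / 32)
    for n in range(32)
]
main, dropped = split(dct(signal), keep=4)

lossy_err = max(abs(a - b) for a, b in zip(idct(main), signal))
restored_err = max(abs(a - b) for a, b in zip(combine(main, dropped), signal))
print(lossy_err)      # the 4-coefficient version alone is visibly off
print(restored_err)   # both files together match up to float rounding
```

Note that even here the recovery is only exact up to floating-point rounding, and the "dropped" file holds 28 of the 32 coefficients, so nothing has been saved yet.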

This makes 100% perfect sense to me.

From:

r-transpose-p.livejournal.com

p.s. I have no idea how mp3 actually works, it just makes sense that whatever it does could be thought of as "convert the time series for sound into some set of basis signals, then drop the lowest coefficients from the representation in the new basis"

Correct me if I'm wrong...

From:

tgies.livejournal.com

Something like that, with additional specializations for the audio application as I mentioned below, but the specific way MP3 does it makes it very impractical to try to bolt on some kind of residual-based lossless functionality. We already have perfectly good lossless compression which is actually designed for that.

From:

tgies.livejournal.com

I was referring specifically to the post I quoted, which is some way down the thread to which I linked.

What you're describing makes perfect sense (except for the minor detail that storing the coefficients for the information you've dropped is, to use the technical term, Really Hard, which is kind of why they've been dropped in the first place), but it's unfortunately not what anyone in that thread is actually suggesting (and not quite how MP3 works, but that's beside the point).

The main problem with the idea in general is some pretty basic information theory shit: The information that was discarded in compression in the first place has a really, really high degree of entropy -- that's why it was discarded. MP3 discards a little more than just really-entropic information, because it is psychoacoustically conscious and attempts to throw away any information at all that it expects you wouldn't be able to hear, but the main problem is that entropic information which was thrown away, both perceptible and imperceptible. Shannon shows that storing that entropy is going to cost us A Whole Lot of Bits. There exists, of course, lossless signal compression which stores the signal as predictive/redundant and residual/entropic components, but that's totally different, because we're talking about a system which was designed that way from the ground up, as opposed to bolting some kind of weird-ass functionality onto MP3.
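
A quick Python sketch of the "entropy costs bits" point. It is only a rough proxy: the zero-order (per-byte) entropy estimate, the use of `zlib` as a stand-in compressor, and the two made-up test streams are all assumptions for illustration, not measurements of real MP3 leftovers.

```python
import math
import random
import zlib
from collections import Counter

def entropy_bits_per_byte(data):
    """Empirical zero-order Shannon entropy: -sum(p * log2(p)) over byte values."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

rng = random.Random(0)                                      # seeded for repeatability
predictable = bytes([0, 1, 2, 3] * 4096)                    # highly redundant stream
noisy = bytes(rng.getrandbits(8) for _ in range(16384))     # noise-like stream

for name, data in (("predictable", predictable), ("noisy", noisy)):
    print(
        name,
        round(entropy_bits_per_byte(data), 3), "bits/byte,",
        len(data), "->", len(zlib.compress(data, 9)), "bytes",
    )
```

The redundant stream shrinks to almost nothing, while the noise-like one does not shrink at all. If the leftover you want to keep looks like the second stream, the "difference file" ends up about as big as the data it describes.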

There is also the fact that different MP3 decoders are not guaranteed to produce bit-identical output, which complicates matters further. MP3 quantization is weird in more ways than I care to explain, and the frequency-domain representation basically makes you guess the phase of any given component.

Edited Date: 2008-05-04 04:34 am (UTC)

