I wanted to see how much AI a phone with only 4 GB of RAM could handle.

I've been testing various small offline AI models on my iPhone SE 2022 (third generation) running iOS 16.4.2. So far, only the DeepSeek R1 1.5B 4-bit model with reasoning, which is roughly 1 GB in size, has crashed the device. I had even manually unloaded the previous model before attempting to load this one.

No other apps were running in the background, and force restarting my iPhone didn't resolve this issue.

A lack of storage space isn't the cause either, because I have over 20 GB free on this device.

I can rule out heat, since the device had been off for hours before I loaded the model again.

My best guess is that this model has unusually high peak memory requirements during initialization. The phone's NVMe-based NAND flash is fast enough for the initial load of other models, but loading this one seems to exhaust memory before initialization finishes.
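One plausible explanation (purely my assumption; the app doesn't expose this) is that the loader briefly holds a dequantized working copy of the weights alongside the 4-bit ones. A quick back-of-the-envelope sketch of what that peak would look like:

```python
# Rough peak-memory estimate for loading a 1.5B-parameter 4-bit model.
# Hypothetical scenario: during initialization the loader briefly keeps
# both the 4-bit weights and an fp16 dequantized copy in RAM.

GIB = 1024 ** 3

def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weight tensors alone, in bytes."""
    return n_params * bits_per_weight / 8

params = 1.5e9                          # DeepSeek R1 1.5B
resident = weight_bytes(params, 4)      # 4-bit quantized weights
transient = weight_bytes(params, 16)    # hypothetical fp16 copy during init

print(f"4-bit weights:  {resident / GIB:.2f} GiB")
print(f"fp16 transient: {transient / GIB:.2f} GiB")
print(f"peak (both):    {(resident + transient) / GIB:.2f} GiB")
```

A peak near 3.5 GiB on a phone with 4 GB of RAM, part of which iOS and the app already occupy, would plausibly trip the system's jetsam memory-limit kill even though the weights alone fit comfortably.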

For comparison, Phi-4 Mini 3.8B (4-bit) with reasoning struggles along at roughly 8 tokens per second, but at least it doesn't crash the device.

Models without reasoning, such as Qwen 3.5 2B (4-bit) and LFM 2.5 1.2B (4-bit), run relatively smoothly on this device at roughly 20 and 40 tokens per second, respectively.
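The throughput numbers above line up with how much of the 4 GB of RAM each model's weights occupy. Here is the same 4-bit arithmetic applied to every model I tried; parameter counts are taken from the model names, and the sizes are rough estimates rather than measured figures:

```python
GIB = 1024 ** 3

# Parameter counts from the model names; 4-bit quantization assumed throughout.
models = {
    "DeepSeek R1 1.5B": 1.5e9,
    "Phi-4 Mini 3.8B":  3.8e9,
    "Qwen 3.5 2B":      2.0e9,
    "LFM 2.5 1.2B":     1.2e9,
}

for name, params in models.items():
    gib = params * 4 / 8 / GIB  # bytes of 4-bit weights, in GiB
    print(f"{name:<18} ~{gib:.2f} GiB of weights")
```

Phi-4 Mini's roughly 1.8 GiB of weights is nearly half the phone's RAM, which fits its sluggish 8 tokens per second, while the sub-1-GiB models leave far more headroom and run noticeably faster.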