Name: AtlasFlux AI
Availability: InStock
Rating: 5 (8 reviews)
Author: Muhammad Nabil

ABSTRACT

Large language models (LLMs) have revolutionised natural language processing, yet their performance in underrepresented languages and cultural contexts remains severely limited. This paper presents atlasflux-qwen-7b-1.0, a fine-tuned variant of the Qwen2.5-7B model adapted specifically for Malaysian Bahasa Melayu, colloquial slang (Manglish), and local cultural knowledge.

The model was fine-tuned using LoRA (Low-Rank Adaptation) on a custom-built dataset of 2,968 instruction-response pairs. We discuss the training methodology, dataset construction, evaluation strategy, deployment challenges, and cost-effective inference solutions. The model is available under the Apache 2.0 license.

1. Introduction

1.1 The Problem

The rapid advancement of LLMs has largely favoured high-resource languages, leaving linguistically diverse regions underserved. Malaysia, with its rich tapestry of Bahasa Melayu, Manglish, and numerous regional dialects, faces a significant gap in AI models that truly understand local communication nuances.

1.3 Project Goals

To fine-tune an open-source 7B-parameter LLM (Qwen2.5-7B) on a curated, high-quality dataset of Malaysian-centric examples.
To minimise computational cost by using LoRA and 4-bit quantisation, enabling training on a single Google Colab T4 GPU.
To release the fine-tuned model under a permissive license (Apache 2.0) to encourage commercial adoption.

2. Model Architecture

2.1 Base Model Specifications

Total Parameters

7.61 billion

Non-embedding Parameters

6.53 billion

Number of Layers

Attention Heads (Query)

Key-Value Heads (GQA)

Context Length

131,072 tokens

Generation Limit

8,192 tokens

Vocabulary Size

152,064

Architecture

RoPE, SwiGLU, RMSNorm

Multilingual Support

29+ languages

3. Dataset Construction

3.1 Design Philosophy

We constructed a custom dataset comprising 2,968 instruction-response pairs in JSONL format, carefully curated to reflect three categories:

~70%

Bahasa Melayu Standard

Educational content, general knowledge, technical explanations

~20%

Colloquial Slang / Manglish

Everyday conversational phrases, informal expressions, 'mamak talk'

~10%

Regional Dialects

Kelantanese, Kedahan, Terengganuan, Johorean, Sabahan, Sarawakian

4. Fine-Tuning Methodology

4.1 Parameter-Efficient Fine-Tuning (PEFT) with LoRA

To minimise computational cost, we employed LoRA. Only 0.26% of the total parameters (approx. 20 million out of 7.6 billion) were trained.

Rank (r)16

LoRA Alpha16

Target Modulesq_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj

Dropout0.0

Biasnone

Gradient CheckpointingEnabled (Unsloth)

4.2 Training Setup

Batch Size per Device1

Gradient Accumulation8

Effective Batch Size8

Learning Rate2e-4

OptimizerAdamW 8-bit

Warmup Steps5

Max Steps500

Max Sequence Length2048 tokens

5. Evaluation

5.1 Qualitative Evaluation

Prompt

Siapa yang membina AtlasFlux AI?

Model Output

AtlasFlux AI dibina oleh Muhammad Nabil, seorang pelajar UiTM Cawangan Melaka Kampus Alor Gajah dalam Diploma Pengurusan Teknologi Pejabat.

Prompt

Apa itu Rainspeed Labs?

Model Output

Rainspeed Labs Solutions adalah syarikat di sebalik pembangunan AtlasFlux AI, beroperasi di Bukit Gambir, Tangkak, Johor.

Prompt

Aku nak gi mana? (Kelantan)

Model Output

(Generated appropriate dialect response with correct grammar)

6. Deployment & Cost Analysis

6.1 Common Obstacles

GPU memory constraints

Solution: Use 4-bit quantisation (QLoRA) to reduce memory to ~4-5 GB.

Outcome: Successful; inference possible on T4.

Cold start latency

Solution: Implement keep-alive pings or use dedicated endpoints.

Outcome: Acceptable for low-traffic phases.

Platform inconsistency

Solution: Prefer serverless, per-token pricing platforms or self-host with vLLM.

Outcome: No ideal solution found; project discontinued before full deployment.

6.2 Cost-Effective Inference

Provider	Input Price	Output Price
Together AI	$0.27 – $0.50	$0.40 – $3.00
Groq	$0.05 – $0.59	$0.08 – $0.79
AWS Bedrock	$1.50 – $25.00	–
Self-hosted (T4)	~$0.50/hour	–

7. Technical Obstacles

Key technical hurdles included memory-related errors (solved via 4-bit quantisation), dataset loading failures (solved via JSONL conversion), and PEFT version conflicts (solved by pinning versions: peft==0.10.0, transformers==4.40.0).

8. Availability & Licensing

Repositoryrainspeed/atlasflux-qwen-7b-1.0

LicenseApache 2.0

Base ModelQwen/Qwen2.5-7B-Instruct

9. Future Work

Expand the dataset

to at least 10,000 instruction-response pairs, including more dialect examples.

Benchmark against Malay-specific tasks

(e.g., MalayMMLU, Malay sentiment analysis) to quantify performance improvements.

Implement safety filtering

(e.g., using content moderation APIs) to reduce harmful or biased outputs.

Provide quantised GGUF versions

(4-bit, 8-bit) for local deployment with llama.cpp or Ollama.

Integrate with RAG

to ground answers in a trusted knowledge base, reducing hallucinations.

10. Conclusion

AtlasFlux demonstrates that fine-tuning a medium-sized LLM (7B parameters) with a modest but high-quality dataset can produce a model capable of understanding Malaysian cultural and linguistic nuances at a fraction of the cost of training from scratch.

Final note: For production applications, retrieval-augmented generation (RAG) using a small, cost-effective language model may be more practical than deploying a custom-fine-tuned 7B model. The author has since pivoted to RAG for the ai.atlasflux.my website.

References

1. Qwen Team. (2024). Qwen2.5-7B-Instruct Model Card.
Hugging Face • huggingface.co/Qwen/Qwen2.5-7B-Instruct
2. Wang, C., et al. (2024). Qwen2.5 Technical Report.
arXiv:2407.10671
3. YTL AI Labs. (2025). Malaysia launches first homegrown LLM, Ilmu.
The Edge Malaysia
4. Unsloth AI. (2025). Unsloth Documentation.
docs.unsloth.ai • docs.unsloth.ai
5. Hugging Face. (2026). Inference Endpoints Documentation.
huggingface.co • huggingface.co/docs/inference-endpoints

AtlasFlux: Fine-Tuning Qwen2.5-7B for Malaysian Cultural and Linguistic Contexts

1. Introduction

1.1 The Problem

1.3 Project Goals

2. Model Architecture

2.1 Base Model Specifications

3. Dataset Construction

3.1 Design Philosophy

4. Fine-Tuning Methodology

4.1 Parameter-Efficient Fine-Tuning (PEFT) with LoRA

4.2 Training Setup

5. Evaluation

5.1 Qualitative Evaluation

6. Deployment & Cost Analysis

6.1 Common Obstacles

6.2 Cost-Effective Inference

7. Technical Obstacles

8. Availability & Licensing

9. Future Work

10. Conclusion

References